From bix at sendu.me.uk Fri Jun 1 04:06:04 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 01 Jun 2007 09:06:04 +0100 Subject: [Bioperl-l] ClustalW Score? In-Reply-To: <1A4207F8295607498283FE9E93B775B40334A01A@EX02.asurite.ad.asu.edu> References: <00e201c7a2de$91f60f50$2d01a8c0@PICO><465E9B58.1020403@sendu.me.uk> <49B6333A-18B9-4B63-80EF-81C57A295494@bioperl.org> <1A4207F8295607498283FE9E93B775B40334A01A@EX02.asurite.ad.asu.edu> Message-ID: <465FD36C.5060603@sendu.me.uk> Kevin Brown wrote: >> you're right --- it is not really my code, I was just >> elaborating Kevin's example --- it would probably need to be >> more specific or perhaps the last Score seen is sufficient >> for what one is trying to capture? > > I took that code from a pairwise clustal alignment script that I wrote > to deal with aligning a bunch of short sequences against a long one to > see where they line up at. When all of them were fed to Clustal the > short sequences all ended up aligned to each other and not well aligned > to the longer sequence. I only saw one score in the output from the > pairwise, so that is what I used to find a reasonable value. Ok, well I've hedged my bets and used both. Now commited to CVS. From jy at genseq.co.uk Fri Jun 1 22:39:48 2007 From: jy at genseq.co.uk (Jean-Yves Sireau) Date: Sat, 2 Jun 2007 10:39:48 +0800 Subject: [Bioperl-l] Genseq Message-ID: <20070602103948.093d713c@jys.my.regentmarkets.com> Dear List members, I would like to let you know of the formation of Genseq Ltd., a bioinformatics company that will (in time!) offer genome sequencing to high net worth individuals and bioinformatic analysis of the sequence data to detect predisposition to illness. The company's website is www.genseq.co.uk Genseq would be willing to sponsor bioperl, whether financially or by providing resources, notably for any bioperl-related activities in the Asia Pacific region. 
Genseq's bioinformatics team will be based in Cyberjaya (Malaysia), and we are in particular interested to promote bioperl in Malaysia. We are also actively recruiting at the moment in Malaysia and India. If there was sufficient demand, we would be willing to organise a bioperl conference in Cyberjaya at the Cyberview Lodge (www.cyberview-lodge.com), which would be the ideal place for such a conference in Malaysia. Looking forward to your comments, suggestions and proposals. Best regards Jean-Yves Sireau -- Jean-Yves Sireau CEO, Genseq Ltd. www.genseq.co.uk From cjfields at uiuc.edu Sat Jun 2 01:16:05 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 2 Jun 2007 00:16:05 -0500 Subject: [Bioperl-l] EUtilities overhaul started Message-ID: To anyone using Bio::DB::EUilities, I am in the midst of a major overhaul to the various EUtilities tools and to Bio::DB::GenericWebDBI (the latter which I am forming into more or less a test bed for other database interfaces). I'm about 80% done at this point, and will likely start committing changes this coming week. The overall interface will change (something I had warned about in the Bio::DB::EUtilities POD) but I am hoping it will be more intuitive and easier to use in the long run. I'll describe the overall redesign and use in an upcoming HOWTO (as recommended by Brian a while back). If anyone has any suggestions/ideas/flames, please let me know! Cheers! chris From cjfields at uiuc.edu Sat Jun 2 10:39:25 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 2 Jun 2007 09:39:25 -0500 Subject: [Bioperl-l] EUtilities overhaul started In-Reply-To: References: Message-ID: Yes, there are a few odd issues, though that's one I've not heard of yet. You might try one of the sub-nucleotide databases (nuccore, nucest, nucgss). I'll try looking into it and (if necessary) pester NCBI about it. I'll pass this on to the mail list to see if anyone else knows about the problem. 
chris On Jun 2, 2007, at 8:28 AM, Bernd Brandt wrote: > Hi Chris, > > Thanks for your work on EUtilities. > For a production task, I used EUtilitities directly (given your > announced overhaul). I noticed a recent problem at NCBI (reported two > weeks ago to NCBI, no reply yet). Possibly you may run into this with > testing: if you ePOST gi ids to the EU server and then use this set in > Esearch (using the query key) no results are returned for the > nucleotide database. > ESearches like "db=$db%23$QueryKey" typically fail if the $db is > nucleotide (but work f $db='protein'). The XML output has Count 0 and > an empty QueryTranslationSet for db=nucleotide only. > For completeness, I attach a simple test script I used. > > > Best regards, > Bernd > > > On 6/2/07, Chris Fields wrote: >> To anyone using Bio::DB::EUilities, >> >> I am in the midst of a major overhaul to the various EUtilities tools >> and to Bio::DB::GenericWebDBI (the latter which I am forming into >> more or less a test bed for other database interfaces). I'm about >> 80% done at this point, and will likely start committing changes this >> coming week. >> >> The overall interface will change (something I had warned about in >> the Bio::DB::EUtilities POD) but I am hoping it will be more >> intuitive and easier to use in the long run. I'll describe the >> overall redesign and use in an upcoming HOWTO (as recommended by >> Brian a while back). >> >> If anyone has any suggestions/ideas/flames, please let me know! >> >> Cheers! >> >> chris >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Sun Jun 3 00:51:57 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 2 Jun 2007 23:51:57 -0500 Subject: [Bioperl-l] EUtilities overhaul started In-Reply-To: References: Message-ID: <1A2AF5C4-6A58-4FDD-A4CA-6ABCE30F0D1B@uiuc.edu> I can confirm this; however it only relates to the use of history with esearch and nucleotide (use of the history with other eutils seems to work fine); retrieving sequences via efetch is not affected. If I find out anything more I'll post something on the mail list. chris On Jun 2, 2007, at 11:48 AM, Bernd Brandt wrote: > I can confirm that using the correct sub-nucleotide database works > (nuccore in my case). > This seems to be a quite recent change/bug at NCBI. Until recently, > db=nucleotide worked. Moreover, EInfo still lists nucleotide as valid > db. > It is not optimal to have to choose the sub-database and the searches > work via the Entrez web-interface. Note that this problem is related > to the ESearch and db=nucleotide. > > bernd > > On 6/2/07, Chris Fields wrote: >> Yes, there are a few odd issues, though that's one I've not heard of >> yet. You might try one of the sub-nucleotide databases (nuccore, >> nucest, nucgss). >> >> I'll try looking into it and (if necessary) pester NCBI about it. >> I'll pass this on to the mail list to see if anyone else knows about >> the problem. >> >> chris >> >> On Jun 2, 2007, at 8:28 AM, Bernd Brandt wrote: >> >> > Hi Chris, >> > >> > Thanks for your work on EUtilities. >> > For a production task, I used EUtilitities directly (given your >> > announced overhaul). I noticed a recent problem at NCBI >> (reported two >> > weeks ago to NCBI, no reply yet). Possibly you may run into this >> with >> > testing: if you ePOST gi ids to the EU server and then use this >> set in >> > Esearch (using the query key) no results are returned for the >> > nucleotide database. 
>> > ESearches like "db=$db%23$QueryKey" typically fail if the $db is >> > nucleotide (but work f $db='protein'). The XML output has Count >> 0 and >> > an empty QueryTranslationSet for db=nucleotide only. >> > For completeness, I attach a simple test script I used. >> > >> > >> > Best regards, >> > Bernd >> > >> > >> > On 6/2/07, Chris Fields wrote: >> >> To anyone using Bio::DB::EUilities, >> >> >> >> I am in the midst of a major overhaul to the various EUtilities >> tools >> >> and to Bio::DB::GenericWebDBI (the latter which I am forming into >> >> more or less a test bed for other database interfaces). I'm about >> >> 80% done at this point, and will likely start committing >> changes this >> >> coming week. >> >> >> >> The overall interface will change (something I had warned about in >> >> the Bio::DB::EUtilities POD) but I am hoping it will be more >> >> intuitive and easier to use in the long run. I'll describe the >> >> overall redesign and use in an upcoming HOWTO (as recommended by >> >> Brian a while back). >> >> >> >> If anyone has any suggestions/ideas/flames, please let me know! >> >> >> >> Cheers! >> >> >> >> chris >> >> _______________________________________________ >> >> Bioperl-l mailing list >> >> Bioperl-l at lists.open-bio.org >> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> >> >> >> >> Christopher Fields >> Postdoctoral Researcher >> Lab of Dr. Robert Switzer >> Dept of Biochemistry >> University of Illinois Urbana-Champaign >> >> >> >> Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From basu at pharm.stonybrook.edu Sun Jun 3 10:44:18 2007 From: basu at pharm.stonybrook.edu (Siddhartha Basu) Date: Sun, 03 Jun 2007 10:44:18 -0400 Subject: [Bioperl-l] EUtilities overhaul started In-Reply-To: References: Message-ID: On Sat, 2 Jun 2007 00:16:05 -0500 Chris Fields wrote: > To anyone using Bio::DB::EUilities, > > I am in the midst of a major overhaul to the various >EUtilities tools > and to Bio::DB::GenericWebDBI (the latter which I am >forming into > more or less a test bed for other database interfaces). > I'm about > 80% done at this point, and will likely start committing >changes this > coming week. > > The overall interface will change (something I had >warned about in > the Bio::DB::EUtilities POD) but I am hoping it will be >more > intuitive and easier to use in the long run. I'll >describe the > overall redesign and use in an upcoming HOWTO (as >recommended by > Brian a while back). Hi chris, Being a frequent user of EUtilities, hopefully this api facelift and upcoming howto will definitely be more helpful. Anyway, one thing i noticed that for each eutil call such as efetch,epost,esearch,esummary a new 'Bio::DB::Utilities' object has to be instantiated. And thereafter it cannot be set during runtime such as $eutils->id('ids'), for example.... my $eutils = Bio::DB::Eutilities->new ( -id => $id, -eutil => 'esummary', -db => 'protein', ); my $ct = $eutils->get_response->content(); ## -- now i cannot do this... $eutils->id($newid); my $ct = $eutils->get_response->content(); Is the new api going to address something along this line or is there currently anyway to reuse the object. Thanks again for this nice toolkit. -siddhartha > > If anyone has any suggestions/ideas/flames, please let >me know! > > Cheers! 
> > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at uiuc.edu Sun Jun 3 19:52:39 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 3 Jun 2007 18:52:39 -0500 Subject: [Bioperl-l] EUtilities overhaul started In-Reply-To: References: Message-ID: <5120BD7B-CA89-46E4-8D6B-6B24C1F93A5E@uiuc.edu> On Jun 3, 2007, at 9:44 AM, Siddhartha Basu wrote: > ... > Hi chris, > Being a frequent user of EUtilities, hopefully this api facelift > and upcoming howto will definitely be more helpful. > Anyway, one thing i noticed that for each eutil call such as > efetch,epost,esearch,esummary a new 'Bio::DB::Utilities' object has > to be > instantiated. And thereafter it cannot be set during runtime such as > $eutils->id('ids'), for example.... > > my $eutils = Bio::DB::Eutilities->new ( -id => $id, > -eutil => 'esummary', > -db => 'protein', > ); > my $ct = $eutils->get_response->content(); > > ## -- now i cannot do this... > $eutils->id($newid); > my $ct = $eutils->get_response->content(); I'll have to check up on that, though changing id() should work with the old API. It won't matter with the new API (it works fine), but it is still troubling... > Is the new api going to address something along this line or is > there currently anyway to reuse > the object. > Thanks again for this nice toolkit. > > -siddhartha The old API was based upon the idea of creating discrete user agents for each eutil to retrieve data. The problem with the old interface is it attempts to do too much (take care of parameters, set up requests, retrieve responses, parse data, etc), and many tasks required instantiating a new EUtilities object. I was never really satisfied with it. 
The new interface is a composition of three classes: the web user agent
(LWP::UserAgent), a class encapsulating parameter handling, and a parser
class (all of which can be used independently if needed). When parameters
change, a new request is made 'lazily' (i.e. only when needed). Similarly,
when data is requested after any parameter change, a new parser instance is
created and the new response is parsed. With that in mind you can now do
the following:

----------------------------------------
my @params = (-eutil  => 'esearch',
              -db     => 'protein',
              -term   => 'BRCA1',
              -retmax => 100);

my $eutil = Bio::DB::EUtilities->new(@params);

# no need to get response first; get_ids() calls that if needed
my @ids = $eutil->get_ids;

# below changes only those parameters, leaves all others set as before
$eutil->set_parameters(-eutil   => 'efetch',
                       -id      => \@ids,
                       -retmode => 'text',
                       -rettype => 'fasta');

# sends streamed content directly to a file
$eutil->get_response(-content_file => 'seqs.fas');

# or to a LWP::UserAgent-supported request callback
$eutil->get_response(-content_cb => \&my_cb);

my @newparams = (-eutil  => 'esearch',
                 -db     => 'protein',
                 -term   => 'BRCA2',
                 -retmax => 100);

# Resets eutility to passed parameters (or undef)
$eutil->reset_parameters(@newparams);

# retrieve new IDs
my @new_ids = $eutil->get_ids;
----------------------------------------

Note the same eutil object is used for all of the above, so to answer your
last question: yes, you should be able to create data pipelines using the
same object if necessary.
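
[Editor's sketch] The 'lazy' request behaviour described above can be illustrated roughly as follows. This is a minimal core-Perl sketch and not the Bio::DB::EUtilities implementation: the LazyClient package, its 'dirty' flag, the 'fetches' counter and the _fetch stand-in are all hypothetical names invented for illustration. The idea is only that changing parameters marks the cached response as stale, and the request is repeated the next time data is asked for.

```perl
use strict;
use warnings;

# Hypothetical illustration only; this is NOT Bio::DB::EUtilities code.
package LazyClient;

sub new {
    my ($class, %params) = @_;
    my $self = bless {
        params   => {%params},
        dirty    => 1,        # nothing cached yet
        response => undef,
        fetches  => 0,        # counts how often a "request" is really made
    }, $class;
    return $self;
}

# Changing parameters only marks the cached response as stale.
sub set_parameters {
    my ($self, %new) = @_;
    @{ $self->{params} }{ keys %new } = values %new;
    $self->{dirty} = 1;
    return $self;
}

# The request is made here, and only if something changed since last time.
sub get_response {
    my ($self) = @_;
    if ($self->{dirty}) {
        $self->{response} = $self->_fetch;
        $self->{dirty}    = 0;
    }
    return $self->{response};
}

# Stand-in for the real HTTP call: builds a query string from the parameters.
sub _fetch {
    my ($self) = @_;
    $self->{fetches}++;
    my $p = $self->{params};
    return join '&', map { "$_=$p->{$_}" } sort keys %$p;
}

package main;

my $client = LazyClient->new(eutil => 'esearch', db => 'protein', term => 'BRCA1');
$client->get_response;                  # first real fetch
$client->get_response;                  # served from the cache
$client->set_parameters(term => 'BRCA2');
my $second = $client->get_response;     # parameters changed, so fetch again

print "fetches: $client->{fetches}\n";
print "last query: $second\n";
```

The three get_response calls trigger only two fetches, which is the saving the lazy design is after when one object is reused across a pipeline.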
chris From sac at bioperl.org Mon Jun 4 13:56:57 2007 From: sac at bioperl.org (Steve Chervitz) Date: Mon, 4 Jun 2007 10:56:57 -0700 Subject: [Bioperl-l] question about Bio::Restriction::Analysis In-Reply-To: <3E4CBE0B-6EE4-4973-80DF-90C7E778DA83@cshl.edu> References: <3E4CBE0B-6EE4-4973-80DF-90C7E778DA83@cshl.edu> Message-ID: <8f200b4c0706041056o4dbaadfexddf9f82fc33c6da@mail.gmail.com> Hi Apurva, I'm cc:ing the list to let others know you have found performance issues with Bio::Restriction::Analysis. Ideally, we should focus on addressing those issues rather than fixing a module that is now deprecated. But taking a quick look at my Bio::Tools::RestrictionEnzyme module, I'm not sure why HpaII would give slower performance relative to other non-ambiguous cutters. This enzyme has a 4-base recognition sequence CCGG, and if you're feeding it a large CG-rich input sequence, that could be a factor. To test, you might try using some other 4-base cutters that aren't CG-rich (TaqI, TasI) or try some other input sequences. There is no special flag to indicate that the enzyme is non-ambiguous. The module handles that automatically. Good luck, Steve On 6/4/07, Apurva Narechania wrote: > Hi Rob and Steve, > > I was hoping you could answer a quick performance question regarding > the Bio::Restriction::Analysis module. I have found that though this > module works well, it is considerably slower than the deprecated > Bio::Tools::RestrictionEnzyme. I see that there are two algorithms > available to your module, and since I am using HpaII, a non-ambiguous > enzyme, I thought I might find similar performance to the older, > deprecated module, but I do not. Is it possible that I am not setting > the non-ambiguous flag correctly? Does it need to be set in the first > place? 
> > As far as Bio::Tools::RestrictionEnzyme, though it is faster, I have > found instances where it is inaccurate, especially in calculating > fragments of extremely small size 1-5 base pairs, so I would like to > use your module if possible. It just seems slow to me. > > Can you clarify? > > I have copied my code below since it is a short, simple script. > > Thanks! > Apurva Narechania > Ware Lab > Cold Spring Harbor Labs > > ---------- > > #!/usr/bin/perl > > # This program generates a fasta of restriction frags given an > # input fasta and a restriction cut site > > use Getopt::Std; > use Bio::Seq; > use Bio::SeqIO; > use strict; > > use Bio::Tools::RestrictionEnzyme; > > my %opts = (); > getopts ('f:', \%opts); > my $fasta = $opts{'f'}; > > # read fasta file > my $seqin = Bio::SeqIO -> new (-format => 'Fasta', -file => "$fasta"); > > my $x = 0; > while (my $sequence_obj = $seqin -> next_seq()){ > $x++; > my $id = $sequence_obj->id(); > > print STDERR "$x Working on $id\n"; > > # generate the rx object > my $ra = new Bio::Tools::RestrictionEnzyme(-NAME=>'HpaII'); > > my @frags = $ra->cut_seq($sequence_obj); > > my $counter = 0; > foreach my $frag (@frags){ > $counter++; > my $length = length ($frag); > print ">$id.$counter length=$length\n$frag\n"; > } > > } > > From anhthu.tieu at gsf.de Tue Jun 5 04:14:09 2007 From: anhthu.tieu at gsf.de (Tieu, Anh-Thu) Date: Tue, 5 Jun 2007 10:14:09 +0200 Subject: [Bioperl-l] problems with image maps and IE 6 or higher Message-ID: <93739F94E0F3BA43AD72423E2482341A1435F6@sw-rz010.gsf.de> Hi, I have a problem using the bioperl image maps function with the IE6 or and higher browser. It might be a more general problem with IE6 rather than with bioperl, but as I used bioperl to create my image maps, I thought I could still post this problem here and ask for people's opinion. I wondered if anyone else faced the same problem and if possible if anyone could share their experiences and their solutions.

scale alignment5 integration_pt gene intron1 usemap="mapnameD064C01" style="border:2px solid #CCCCCC;"/>

> > onclick="javascript:void(zmenu( 'scale' ));;return false;" title="scale " > alt="scale " target="_blank"/> > onclick="javascript:void(zmenu( 'alignment 5splk', '', 'seq_id: ', '', > 'start: ', '', 'stop: 0', '', 'length: bp', '', 'identity: ', '', 'e-v > alue: ' ));;return false;" title="alignment5 " alt="alignment5 " > target="_blank"/> > onclick="javascript:void(zmenu( 'alignment 5splk', '', 'seq_id: ', '', > 'start: ', '', 'stop: 0', '', 'length: bp', '', 'identity: ', '', 'e-v > alue: ' ));;return false;" title="integration_pt " alt="integration_pt " > target="_blank"/> > onclick="javascript:void(zmenu( 'Nphs1 ', > '', 'ensembl_id: ENSMUSG00000006649', '', 'start: 30168485', '', ' > stop: 30195968', '', 'length: 27483 bp' ));;return false;" title="gene " > alt="gene " target="_blank"/> > onclick="javascript:void(zmenu( 'exon1', '', 'start: 30168485', '', 'stop: > 30169003', '', 'length: 518 bp' ));;return false;" title="exon1 " a > lt="exon1 " target="_blank"/> > onclick="javascript:void(zmenu( 'intron1', '', 'start: 30169004', '', 'stop: > 30169083', '', 'length: 79 bp ' ));;return false;" title="intron1 > " alt="intron1 " target="_blank"/> > onclick="javascript:void(zmenu( 'exon2', '', 'start: 30169084', '', 'stop: > 30169299', '', 'length: 215 bp' ));;return false;" title="exon2 " a > lt="exon2 " target="_blank"/> > onclick="javascript:void(zmenu( 'intron2', '', 'start: 30169300', '', 'stop: > 30169373', '', 'length: 73 bp ' ));;return false;" title="intron2 > .. >
> > > This is part of the code I used in my HTML file to display the image map > and it really runs beautifully > with Mozilla 1.7 or the latest Firefox version. However, if used in IE6 > the clickable pop-ups do not appear/ work. > > I appreciate any help and would like to thank everyone for their help. > > Best regards, > > > Anh-Thu > ________________________________________________________________________ > GSF-Forschungszentrum > > Ingolst?dter Landstr. 1 > > 85764 M?nchen-Neuherberg, Germany > > Chairman of Supervisory Board: MinDir Dr. Peter Lange > > Board of Directors: Prof. Dr. G?nther Wess and Dr. Nikolaus Blum > > Register of Societies: Amtsgericht M?nchen HRB 6466 > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From cjfields at uiuc.edu Tue Jun 5 11:28:24 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 5 Jun 2007 10:28:24 -0500 Subject: [Bioperl-l] [BioPython] Cannot parse GenBank file In-Reply-To: <46656D64.7010508@ribosome.natur.cuni.cz> References: <46655550.70400@ribosome.natur.cuni.cz> <46656D64.7010508@ribosome.natur.cuni.cz> Message-ID: <24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu> Martin, The example file you give in the bioperl bugzilla report has several blank annotation lines which may lead to additional problems. When the BioPerl SeqIO parser finds annotation fields (SOURCE, ORGANISM, DEFINITION, etc) then it expects there will also be relevant data (text descriptions) accompanying it; I assume the BioPython parser expects likewise though I may be wrong. AFAIK the inclusion of field names w/o text isn't GenBank/EMBL- compliant. 
GenBank records lacking text either have a '.' instead or are left out entirely: http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html We could add a fix but you should probably contact the ApE developers and request that field names w/o text be left out or have '.' added. chris On Jun 5, 2007, at 9:04 AM, Martin MOKREJ? wrote: > Ezequiel Panepucci wrote: >>> genbank entry = parser.parse(fhandle) >> >> there is a space character between "genbank" and "entry". >> It is a syntax error. >> I suppose you meant "genbank_entry" ? > > Yes, the next command was right and has shown the error. Sorry, I > forgot > to delete the first attempt. ;-) > >>>> genbank_entry = parser.parse(fhandle) > Traceback (most recent call last): > File "", line 1, in ? > File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py", > line 187, in parse > self._scanner.feed(handle, self._consumer) > File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", > line 360, in feed > self._feed_first_line(consumer, self.line) > File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", > line 835, in _feed_first_line > assert False, \ > AssertionError: Did not recognise the LOCUS line layout: > LOCUS 6499 bp ds-DNA linear 02-AUG-2006 > >>>> > > Martin > _______________________________________________ > BioPython mailing list - BioPython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From stewarta at nmrc.navy.mil Tue Jun 5 11:34:14 2007 From: stewarta at nmrc.navy.mil (Andrew Stewart) Date: Tue, 5 Jun 2007 11:34:14 -0400 Subject: [Bioperl-l] Setting attributes on a Bio::DB::GFF::Feature object Message-ID: <95C9F539-A4C4-4B6A-8DA8-079B957BF909@nmrc.navy.mil> I see bidirectional mutator methods for source, type, strand, etc. 
in the Bio::DB::GFF::Feature documentation, but I see that ->attributes is
only able to get and not set the feature attributes. Is there no way to
modify the attributes of a Bio::DB::GFF::Feature live?

--
Andrew Stewart
Research Assistant, Genomics Team
Navy Medical Research Center (NMRC)
Biological Defense Research Directorate (BDRD)
BDRD Annex
12300 Washington Avenue, 2nd Floor
Rockville, MD 20852
email: stewarta at nmrc.navy.mil
phone: 301-231-6700 Ext 270

From cjfields at uiuc.edu Tue Jun 5 12:07:41 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 5 Jun 2007 11:07:41 -0500
Subject: [Bioperl-l] [BioPython] Cannot parse GenBank file
In-Reply-To: <24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu>
References: <46655550.70400@ribosome.natur.cuni.cz> <46656D64.7010508@ribosome.natur.cuni.cz> <24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu>
Message-ID:

One thing I missed which explains the biopython error: the LOCUS line is
missing the locus identifier (see the NCBI example record link). This
doesn't choke the bioperl parser but it appears to stop the biopython
parser in its tracks (maybe a feature instead of a bug!). You should try
adding a unique identifier (maybe the name of the file or record) to the
LOCUS line to see if it works:

LOCUS testfile 6499 bp ds-DNA linear 02-AUG-2006

The bioperl parser in CVS writes out the correct alphabet when this is
added:

LOCUS testfile 6499 bp ds-DNA linear 02-AUG-2006

I'll try adding a warning to the bioperl parser for this.

chris

On Jun 5, 2007, at 10:28 AM, Chris Fields wrote:

> Martin,
>
> The example file you give in the bioperl bugzilla report has several
> blank annotation lines which may lead to additional problems. When
> the BioPerl SeqIO parser finds annotation fields (SOURCE, ORGANISM,
> DEFINITION, etc) then it expects there will also be relevant data
> (text descriptions) accompanying it; I assume the BioPython parser
> expects likewise though I may be wrong.
> > AFAIK the inclusion of field names w/o text isn't GenBank/EMBL- > compliant. GenBank records lacking text either have a '.' instead or > are left out entirely: > > http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html > > We could add a fix but you should probably contact the ApE developers > and request that field names w/o text be left out or have '.' added. > > chris > > On Jun 5, 2007, at 9:04 AM, Martin MOKREJ? wrote: > >> Ezequiel Panepucci wrote: >>>> genbank entry = parser.parse(fhandle) >>> >>> there is a space character between "genbank" and "entry". >>> It is a syntax error. >>> I suppose you meant "genbank_entry" ? >> >> Yes, the next command was right and has shown the error. Sorry, I >> forgot >> to delete the first attempt. ;-) >> >>>>> genbank_entry = parser.parse(fhandle) >> Traceback (most recent call last): >> File "", line 1, in ? >> File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py", >> line 187, in parse >> self._scanner.feed(handle, self._consumer) >> File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", >> line 360, in feed >> self._feed_first_line(consumer, self.line) >> File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", >> line 835, in _feed_first_line >> assert False, \ >> AssertionError: Did not recognise the LOCUS line layout: >> LOCUS 6499 bp ds-DNA linear 02-AUG-2006 >> >>>>> >> >> Martin >> _______________________________________________ >> BioPython mailing list - BioPython at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biopython > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > > _______________________________________________ > BioPython mailing list - BioPython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From staffa at niehs.nih.gov Tue Jun 5 22:00:34 2007
From: staffa at niehs.nih.gov (Staffa, Nick (NIH/NIEHS))
Date: Tue, 05 Jun 2007 22:00:34 -0400
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To:
Message-ID:

I am wondering whether, if I knew exactly what this error message meant, I
could discern my error. I don't see much difference between this program
and programs that worked. Can I assume that the new() worked because an
index file exists? I don't know how the filehandle UTR_TT_GENES gets
involved. Maybe I should use some other module, but I really would like to
have get_Seq_by_id functionality.

The error message:

Dpse ortholog = Dpse_GA17307
fetching GA17307
Can't call method "seq" on an undefined value at Match-emNEWTEST.pl line 84,
<UTR_TT_GENES> line 4.

Relevant code:

#!/usr/bin/perl
#
#
#
use strict;
use Bio::DB::Fasta;
use Bio::Tools::SeqWords;
use Bio::Seq;
use Bio::SeqIO;
#
my $db = Bio::DB::Fasta->new(
    '/home/staffa/clients/Kari/D_pse_genome/testit/TT_orthologs_Dpse_genes.fa',
    -makeid => \&make_my_id);
...
...
...
my $pse_obj = $db->get_Seq_by_id('GA17307');
my $pse_sequence = $pse_obj->seq;

Nick Staffa
Telephone: 919-316-4569 (NIEHS: 6-4569)
Scientific Computing Support Group
NIEHS Information Technology Support Services Contract
(Science Task Monitor: John D. Grovenstein (grovens1 at niehs.nih.gov)
National Institute of Environmental Health Sciences
National Institutes of Health
Research Triangle Park, North Carolina

From jason at bioperl.org Tue Jun 5 23:12:40 2007
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 5 Jun 2007 20:12:40 -0700
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To:
References:
Message-ID:

The file handle is probably not important; Perl just reports this if there
is a filehandle open. More importantly, what is on line 84? My guess is you
are trying to get a sequence out and it doesn't exist. Some error-checking
code around the lines getting the sequence out would be helpful.
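
[Editor's sketch] The kind of check suggested above could look roughly like this. It is a sketch under stated assumptions, not code from the thread: the poster's make_my_id is not shown, so the version here (keep the first word of the header and strip a leading "Dpse_") is a guess at what would map the header ID Dpse_GA17307 to the lookup key GA17307, and fetch_sequence is a hypothetical helper name. The only BioPerl behaviour relied on is that get_Seq_by_id returns an undefined value when the ID is not in the index.

```perl
use strict;
use warnings;

# Hypothetical ID callback: keep the first word of the FASTA header and drop
# a leading "Dpse_" prefix, so ">Dpse_GA17307 ..." is indexed as "GA17307".
# The real header layout in the poster's file is not shown in the thread.
sub make_my_id {
    my ($header) = @_;
    my ($id) = $header =~ /^>?(\S+)/;
    return $header unless defined $id;   # leave odd headers untouched
    $id =~ s/^Dpse_//;
    return $id;
}

# Fetch a sequence string from any object offering get_Seq_by_id(). On a
# missing ID this warns and returns nothing instead of dying on ->seq.
sub fetch_sequence {
    my ($db, $id) = @_;
    my $seq_obj = $db->get_Seq_by_id($id);
    unless (defined $seq_obj) {
        warn "No sequence with ID '$id' in the index\n";
        return;
    }
    return $seq_obj->seq;
}

# With BioPerl installed, usage would be along these lines
# ($fasta_path is a placeholder for the indexed FASTA file):
#
#   use Bio::DB::Fasta;
#   my $db  = Bio::DB::Fasta->new($fasta_path, -makeid => \&make_my_id);
#   my $seq = fetch_sequence($db, 'GA17307');
#   die "sequence not in database\n" unless defined $seq;
```

Keeping the lookup in one helper means a mismatch between the IDs produced by the callback and the IDs being requested shows up as a readable warning naming the ID, rather than as a method call on an undefined value.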
On Jun 5, 2007, at 7:00 PM, Staffa, Nick (NIH/NIEHS) wrote: > I am wondering if I knew what this error message exactly meant, if > I could > discern my error. > I don't see much difference in this program and programs that worked. > Can I assume that the new worked because an index file exists? > I don't know how the filehandle UTR_TT_GENES gets involved. > Maybe I should use some other module, but I really would like to have > get_Seq_by_id functionality. > > The error message: > Dpse ortholog = Dpse_GA17307 > fetching GA17307 > Can't call method "seq" on an undefined value at Match-emNEWTEST.pl > line 84, > line 4. > > Relevant code: > #!/usr/bin/perl > # > # > # > use strict; > use Bio::DB::Fasta; > use Bio::Tools::SeqWords; > use Bio::Seq; > use Bio::SeqIO; > # > my $db = > Bio::DB::Fasta->new('/home/staffa/clients/Kari/D_pse_genome/testit/ > TT_orthol > ogs_Dpse_genes.fa', > -makeid => \&make_my_id); > ... > ... > ... > my $pse_obj = $db->get_Seq_by_id('GA17307'); > my $pse_sequence = $pse_obj->seq; > > > > > Nick Staffa > Telephone: 919-316-4569 (NIEHS: 6-4569) > Scientific Computing Support Group > NIEHS Information Technology Support Services Contract > (Science Task Monitor: John D. Grovenstein (grovens1 at niehs.nih.gov) > National Institute of Environmental Health Sciences > National Institutes of Health > Research Triangle Park, North Carolina > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070605/7e056ff6/attachment-0001.html -------------- next part -------------- A non-text attachment was scrubbed... 
Name: smime.p7s Type: application/pkcs7-signature Size: 2613 bytes Desc: not available Url : http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070605/7e056ff6/attachment-0001.bin From torsten.seemann at infotech.monash.edu.au Wed Jun 6 02:06:37 2007 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Wed, 6 Jun 2007 16:06:37 +1000 Subject: [Bioperl-l] Bio::DB::Fasta In-Reply-To: References: Message-ID: Nick, > Can't call method "seq" on an undefined value at Match-emNEWTEST.pl line 84, The error makes it pretty clear. You are calling the ->seq method on an undefined value, ie. $pse_obj. > my $pse_obj = $db->get_Seq_by_id('GA17307'); # check we got something! die "sequence not in database" unless $pse_obj; > my $pse_sequence = $pse_obj->seq; -- --Torsten Seemann --Victorian Bioinformatics Consortium, Monash University --Tel +61 3 9905 9010 From shameer at ncbs.res.in Wed Jun 6 02:27:42 2007 From: shameer at ncbs.res.in (Shameer Khadar) Date: Wed, 6 Jun 2007 11:57:42 +0530 (IST) Subject: [Bioperl-l] Validation of files using BioPerl Message-ID: <34441.192.168.1.1.1181111262.squirrel@mail.ncbs.res.in> Dear All, How to validate an input file in fasta/PIR/GenPept/PDB format using Bioperl ? (This is to avoid unnecessary files to be submitted to servers by new users). Any module available ? Many thanks in advance, -- Shameer Khadar From cjfields at uiuc.edu Wed Jun 6 08:37:28 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 6 Jun 2007 07:37:28 -0500 Subject: [Bioperl-l] Validation of files using BioPerl In-Reply-To: <34441.192.168.1.1.1181111262.squirrel@mail.ncbs.res.in> References: <34441.192.168.1.1.1181111262.squirrel@mail.ncbs.res.in> Message-ID: <39F5F622-0C93-4DC5-B969-491F789FC932@uiuc.edu> It has been discussed but never coded. 
I believe if it passes through the Bio::SeqIO parser it's generally considered validly formatted (spacing, balanced quotes), though it doesn't specifically check FT keys and qualifiers for invalid ones, look for missing annotation, check taxonomy, etc. As long as the end sequence mark (//) is present for every record, you could try parsing the file into chunks (read with 'local $/ = '//';') and tossing the seq chunks as a filehandle (via IO::String) to a Bio::SeqIO object wrapped in an eval block (the parser resets $/, so it should work). Follow the eval with a check of $@ for caught errors. It might get tedious for big sequences... chris On Jun 6, 2007, at 1:27 AM, Shameer Khadar wrote: > Dear All, > > How to validate an input file in fasta/PIR/GenPept/PDB format using > Bioperl ? (This is to avoid unnecessary files to be submitted to > servers > by new users). Any module available ? > > Many thanks in advance, > -- > Shameer Khadar Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From staffa at niehs.nih.gov Wed Jun 6 10:40:49 2007 From: staffa at niehs.nih.gov (Staffa, Nick (NIH/NIEHS)) Date: Wed, 06 Jun 2007 10:40:49 -0400 Subject: [Bioperl-l] Bio::DB::Fasta In-Reply-To: Message-ID: Indeed. One must know what is actually in his header, AND one must write the appropriate make_id subroutine AND one must specify the exact ID. THEN things might work. And they did! THANK YOU On 6/6/07 2:06 AM, "Torsten Seemann" wrote: > Nick, > >> Can't call method "seq" on an undefined value at Match-emNEWTEST.pl line 84, > > The error makes it pretty clear. You are calling the ->seq method on > an undefined value, ie. $pse_obj. > >> my $pse_obj = $db->get_Seq_by_id('GA17307'); > > # check we got something!
> die "sequence not in database" unless $pse_obj; > >> my $pse_sequence = $pse_obj->seq; > From jaudall at gmail.com Wed Jun 6 17:51:33 2007 From: jaudall at gmail.com (Joshua Udall) Date: Wed, 6 Jun 2007 15:51:33 -0600 Subject: [Bioperl-l] blastxml interation Message-ID: <52cea20c0706061451i39e44aeev8dc58d1e635665e7@mail.gmail.com> I was searching in the deobfuscator under *Bio::Search::Result::BlastResult* but there doesn't seem to be a method to extract the iteration number from a blastxml report. I can see this number being possibly useful to count the number of queries that didn't hit anything since there are no empty reports in the blastxml output. If I'm missing something, I would welcome an example of how to retrieve the result iteration number. Thanks in advance for any suggestions. Josh From dmessina at wustl.edu Wed Jun 6 18:18:26 2007 From: dmessina at wustl.edu (David Messina) Date: Wed, 6 Jun 2007 17:18:26 -0500 Subject: [Bioperl-l] blastxml interation In-Reply-To: <52cea20c0706061451i39e44aeev8dc58d1e635665e7@mail.gmail.com> References: <52cea20c0706061451i39e44aeev8dc58d1e635665e7@mail.gmail.com> Message-ID: I think you want to look at the hits(), num_hits() and no_hits_found() methods. There is a private method _next_iteration_index() which should do what you asked for, but num_hits() looks like the better way. By the way, hits() and num_hits() are listed on the Deobfuscator as having no documentation. This (as the below shows) is incorrect and is due to some nonstandard formatting issues which I will correct. _next_iteration_index() isn't listed on the Deobfuscator because it's a private method. Hope this helps! Dave hits() This method overrides Bio::Search::Result::GenericResult::hits to take into account the possibility of multiple iterations, as occurs in PSI-BLAST reports. If there are multiple iterations, all 'new' hits for all iterations are returned. These are the hits that did not occur in a previous iteration.
See Also: Bio::Search::Result::GenericResult::hits num_hits() This method overrides Bio::Search::Result::GenericResult::num_hits to take into account the possibility of multiple iterations, as occurs in PSI-BLAST reports. If there are multiple iterations, calling num_hits() returns the number of 'new' hits for each iteration. These are the hits that did not occur in a previous iteration. See Also: Bio::Search::Result::GenericResult::num_hits no_hits_found() Usage : $nohits = $blast->no_hits_found( $iteration_number ); Purpose : Get boolean indicator indicating whether or not any hits were present in the report. This is NOT the same as determining the number of hits via the hits() method, which will return zero hits if there were no hits in the report or if all hits were filtered out during the parse. Thus, this method can be used to distinguish these possibilities for hitless reports generated when filtering. Returns : Boolean Argument : (optional) integer indicating the iteration number (PSI-BLAST) If iteration number is not specified and this is a PSI-BLAST result, then this method will return true only if all iterations had no hits found. From apurva at cshl.edu Wed Jun 6 19:51:45 2007 From: apurva at cshl.edu (Apurva Narechania) Date: Wed, 6 Jun 2007 19:51:45 -0400 Subject: [Bioperl-l] non-palindromic issue in Bio::Restriction::Analysis Message-ID: <3F7C7E33-416A-4141-969A-DDC4716E8A44@cshl.edu> Hi, I was hoping you could confirm and give me some feedback on an issue I think I've found with the Bio::Restriction::Analysis module. I am using the enzyme AciI, a non-palindromic restriction enzyme with a 5' C | CGC 3' recognition site. The module should search both the forward and the reverse complement strings in the case of a non-palindromic enzyme. I have found that this works only intermittently.
For example, the following sequence: GAAAAAAACAAAGGAAGAAGCTAGCTAGCAGGGCACGCGGTTTGAGGATGGCTGGTGGCCGACCGCAGGGCG CGCGGTTG GAGGATTGCTGGTGGCCGACCAGATGAAACTCACGCGCGGCTGGGGACAGCTGGAATATTTGGGCGGCGGCG GCTGGTAT TACGGGAAAGGAGAGATAGGGTTTTGGACGGCAGCAGCTGGTATTTGGGCCACCAATTTTGCGCGCCAGTAC AGGACACC GATGCCGCAAATTGCACAATGCCTTTTATGGCGACTGACAGTGCGATGCTATAGGTATGAATTGTCGACTGA CAAAGTGA CACTATTCACATATAAATATAACGAATAACACTCAGTTGGAATATAGACATATGCCGACTCACCATCTGTGG CAATGTAT ACCGACTAACAATTCGATGCTAATTCTCTATTTATAGCGACAGTCGTCAGACACTAATTTGGTGTTGTGGTA TAATGCTA GTGCCTCACCGCTGTAGGTGTTGGTCTACTGGTGC Should digest into 10 fragments using this enzyme, but the module produces only 7. Could you please confirm this behavior, and if observed, suggest some possible fixes? This may be a bug in the _non_pal_enz method, or may be me overlooking something pretty obvious. Thanks, Apurva Narechania. From cjfields at uiuc.edu Wed Jun 6 20:51:00 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 6 Jun 2007 19:51:00 -0500 Subject: [Bioperl-l] blastxml interation In-Reply-To: References: <52cea20c0706061451i39e44aeev8dc58d1e635665e7@mail.gmail.com> Message-ID: Joshua, Just to make sure there is no confusion, do you mean a Bio::Search::Iteration::IterationI-based object? The iteration tags have multiple meanings apparently in BLAST XML output (multiple queries, multiple PSI-BLAST iterations). The current SearchIO::blastxml parser returns multiple Bio::Search::Result::BlastResult objects based on the iterations, so PSI-BLAST output is treated as multiple BLAST reports regardless (i.e. no Iteration objects). This is something I want to rectify but it may not be a easy fix. chris On Jun 6, 2007, at 5:18 PM, David Messina wrote: > I think you want to look at the hits(), num_hits() and no_hits_found > () methods. There is a private method _next_iteration_index() which > should do what you asked for, but num_hits() looks like the better > way. 
> > By the way, hits() and num_hits() are listed on the Deobfuscator as > having no documentation. This (as the below shows) is incorrect and > is due to some nonstandard formatting issues which I will correct. > _next_iteration_index() isn't listed on the Deobfuscator because it's > a private method. > > > Hope this helps! > Dave > > > hits() > > This method overrides Bio::Search::Result::GenericResult::hits to take > into account the possibility of multiple iterations, as occurs in PSI- > BLAST reports. > If there are multiple iterations, all 'new' hits for all iterations > are returned. > These are the hits that did not occur in a previous iteration. > See Also: Bio::Search::Result::GenericResult::hits > > num_hits() > > This method overrides Bio::Search::Result::GenericResult::num_hits to > take > into account the possibility of multiple iterations, as occurs in PSI- > BLAST reports. > If there are multiple iterations, calling num_hits() returns the > number of > 'new' hits for each iteration. These are the hits that did not occur > in a previous iteration. > See Also: Bio::Search::Result::GenericResult::num_hits > > no_hits_found() > > Usage : $nohits = $blast->no_hits_found( $iteration_number ); > Purpose : Get boolean indicator indicating whether or not any hits > were present in the report. > This is NOT the same as determining the number of > hits via > the hits() method, which will return zero hits if there > were no > hits in the report or if all hits were filtered out > during the parse. > > Thus, this method can be used to distinguish these > possibilities > for hitless reports generated when filtering. > > Returns : Boolean > Argument : (optional) integer indicating the iteration number (PSI- > BLAST) > If iteration number is not specified and this is a PSI- > BLAST result, > then this method will return true only if all > iterations had > no hits found. 
> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Wed Jun 6 20:45:14 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 6 Jun 2007 20:45:14 -0400 Subject: [Bioperl-l] PostgreSQL schema support in BioSQL and bioperl-db Message-ID: I have added support to BioSQL and bioperl-db for schemas in PostgreSQL. A schema in PostgreSQL is more or less a namespace for database objects (tables, indexes, views, etc) within a database. (A database in PostgreSQL is similar to the concept of a user in Oracle or MySQL, and therefore for the latter two schemas are synonymous with a user. [Not sure I'm still up-to-date on this for MySQL, but at least that's what I recall.]) When using the load_{seqdatabase,ontology,ncbi_taxonomy}.pl scripts, you specify the schema in which BioSQL resides using the --schema option. If you are using bioperl-db as a library, the Bio::DB::BioDB->new() call also accepts a -schema named parameter, and Bio::DB::DBContextI objects have a $dbc->schema() property for getting/setting the schema, Bio::DB::SimpleDBContext->new() accepts a -schema parameter, and you may also add the property to the .bioperldb connection parameter file (-schema => 'yourschemahere'). Thanks for Brian Osborne for being the instigator (and tester, and for adding the code to load_ncbi_taxonomy.pl - I came too late). 
-hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From jaudall at gmail.com Wed Jun 6 17:41:08 2007 From: jaudall at gmail.com (Joshua Udall) Date: Wed, 6 Jun 2007 15:41:08 -0600 Subject: [Bioperl-l] blastxml interation number Message-ID: <52cea20c0706061441n96ce803v9422e8d14461c2bd@mail.gmail.com> I was searching in the deobfuscator under *Bio::Search::Result::BlastResult* but there doesn't seem to be a method to extract the iteration number from a blastxml report. I can see this number being very useful to count the number of queries that didn't hit anything since there are no empty reports in the blastxml output. If I'm missing something, I would welcome an example of how to retrieve the result iteration number; otherwise I'm suggesting that an iteration_count feature be added to the Result object. Thanks in advance for any suggestions. Josh From holland at ebi.ac.uk Thu Jun 7 03:33:25 2007 From: holland at ebi.ac.uk (Richard Holland) Date: Thu, 07 Jun 2007 08:33:25 +0100 Subject: [Bioperl-l] PostgreSQL schema support in BioSQL and bioperl-db In-Reply-To: References: Message-ID: <4667B4C5.6070107@ebi.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Sounds great. BioJava users shouldn't need to change anything to get this to work as PostgreSQL JDBC connection objects already require you to specify a schema. cheers, Richard Hilmar Lapp wrote: > I have added support to BioSQL and bioperl-db for schemas in PostgreSQL. > A schema in PostgreSQL is more or less a namespace for database objects > (tables, indexes, views, etc) within a database. > > (A database in PostgreSQL is similar to the concept of a user in Oracle > or MySQL, and therefore for the latter two schemas are synonymous with a > user.
[Not sure I'm still up-to-date on this for MySQL, but at least > that's what I recall.]) > > When using the load_{seqdatabase,ontology,ncbi_taxonomy}.pl scripts, you > specify the schema in which BioSQL resides using the --schema option. > > If you are using bioperl-db as a library, the Bio::DB::BioDB->new() call > also accepts a -schema named parameter, and Bio::DB::DBContextI objects > have a $dbc->schema() property for getting/setting the schema, > Bio::DB::SimpleDBContext->new() accepts a -schema parameter, and you may > also add the property to the .bioperldb connection parameter file > (-schema => 'yourschemahere'). > > Thanks for Brian Osborne for being the instigator (and tester, and for > adding the code to load_ncbi_taxonomy.pl - I came too late). > > -hilmar > --=========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.2.2 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGZ7TF4C5LeMEKA/QRApwUAJ48q46iX152pB6Xcc/717Ie8foUTQCgm3ij W/+0iO/ZsNDn1pLuf5yXbYA= =asUn -----END PGP SIGNATURE----- From mmokrejs at ribosome.natur.cuni.cz Thu Jun 7 10:26:44 2007 From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=) Date: Thu, 07 Jun 2007 16:26:44 +0200 Subject: [Bioperl-l] [BioPython] Cannot parse GenBank file In-Reply-To: References: <46655550.70400@ribosome.natur.cuni.cz> <46656D64.7010508@ribosome.natur.cuni.cz> <24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu> Message-ID: <466815A4.9060505@ribosome.natur.cuni.cz> Hi, Chris Fields wrote: > One thing I missed which explains the biopython error: the LOCUS line is > missing the locus identifier (see the NCBI example record link). This > doesn't choke the bioperl parser but it appears to stop the biopython > parser in it's tracks (maybe a feature instead of a bug!). 
> > You should try adding a unique identifier (maybe the name of the file or > record) to the LOCUS line to see if it works: > > LOCUS testfile 6499 bp ds-DNA linear 02-AUG-2006 > > The bioperl parser in CVS writes out the correct alphabet when this is > added: > > LOCUS testfile 6499 bp ds-DNA linear 02-AUG-2006 > > I'll try adding a warning to the bioperl parser for this. I have updated http://bugzilla.open-bio.org/show_bug.cgi?id=2305 but let me emphasize the LOCUS line now contains LOCUS pRL 5428 bp ds-DNA linear 07-JUN-2007 which still does not comply with the line you have proposed. But it can be parsed by bioperl-live from cvs. Is it still wrong? Testcase as pRL.gb-new in the bugzilla record #2305. Martin > > chris > > On Jun 5, 2007, at 10:28 AM, Chris Fields wrote: > >> Martin, >> >> The example file you give in the bioperl bugzilla report has several >> blank annotation lines which may lead to additional problems. When >> the BioPerl SeqIO parser finds annotation fields (SOURCE, ORGANISM, >> DEFINITION, etc) then it expects there will also be relevant data >> (text descriptions) accompanying it; I assume the BioPython parser >> expects likewise though I may be wrong. >> >> AFAIK the inclusion of field names w/o text isn't GenBank/EMBL- >> compliant. GenBank records lacking text either have a '.' instead or >> are left out entirely: >> >> http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html >> >> We could add a fix but you should probably contact the ApE developers >> and request that field names w/o text be left out or have '.' added. >> >> chris >> >> On Jun 5, 2007, at 9:04 AM, Martin MOKREJ? wrote: >> >>> Ezequiel Panepucci wrote: >>>>> genbank entry = parser.parse(fhandle) >>>> >>>> there is a space character between "genbank" and "entry". >>>> It is a syntax error. >>>> I suppose you meant "genbank_entry" ? >>> >>> Yes, the next command was right and has shown the error. Sorry, I >>> forgot >>> to delete the first attempt. 
;-) >>> >>>>>> genbank_entry = parser.parse(fhandle) >>> Traceback (most recent call last): >>> File "", line 1, in ? >>> File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py", >>> line 187, in parse >>> self._scanner.feed(handle, self._consumer) >>> File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", >>> line 360, in feed >>> self._feed_first_line(consumer, self.line) >>> File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", >>> line 835, in _feed_first_line >>> assert False, \ >>> AssertionError: Did not recognise the LOCUS line layout: >>> LOCUS 6499 bp ds-DNA linear 02-AUG-2006 >>> >>>>>> >>> >>> Martin >>> _______________________________________________ >>> BioPython mailing list - BioPython at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/biopython >> >> Christopher Fields >> Postdoctoral Researcher >> Lab of Dr. Robert Switzer >> Dept of Biochemistry >> University of Illinois Urbana-Champaign >> >> >> >> >> _______________________________________________ >> BioPython mailing list - BioPython at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biopython > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > > -- Dr. Martin Mokrejs Dept. 
of Genetics and Microbiology Faculty of Science, Charles University Vinicna 5, 128 43 Prague, Czech Republic http://www.iresite.org http://www.iresite.org/~mmokrejs From cjfields at uiuc.edu Thu Jun 7 11:31:45 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 7 Jun 2007 10:31:45 -0500 Subject: [Bioperl-l] [BioPython] Cannot parse GenBank file In-Reply-To: <466815A4.9060505@ribosome.natur.cuni.cz> References: <46655550.70400@ribosome.natur.cuni.cz> <46656D64.7010508@ribosome.natur.cuni.cz> <24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu> <466815A4.9060505@ribosome.natur.cuni.cz> Message-ID: <2A403865-F1E8-4D19-8D19-455C22E7C6D9@uiuc.edu> On Jun 7, 2007, at 9:26 AM, Martin MOKREJ? wrote: > Hi, > > Chris Fields wrote: >> One thing I missed which explains the biopython error: the LOCUS >> line is missing the locus identifier (see the NCBI example record >> link). This doesn't choke the bioperl parser but it appears to >> stop the biopython parser in it's tracks (maybe a feature instead >> of a bug!). >> You should try adding a unique identifier (maybe the name of the >> file or record) to the LOCUS line to see if it works: >> LOCUS testfile 6499 bp ds-DNA linear 02-AUG-2006 >> The bioperl parser in CVS writes out the correct alphabet when >> this is added: >> LOCUS testfile 6499 bp ds-DNA linear 02- >> AUG-2006 >> I'll try adding a warning to the bioperl parser for this. > > I have updated http://bugzilla.open-bio.org/show_bug.cgi?id=2305 > but let me > emphasize the LOCUS line now contains > LOCUS pRL 5428 bp ds-DNA linear > 07-JUN-2007 > > > which still does not comply with the line you have proposed. But it > can be > parsed by bioperl-live from cvs. Is it still wrong? Testcase as > pRL.gb-new > in the bugzilla record #2305. > > Martin That should work. 
There isn't a strict uniqueness test (that would require caching and isn't worth the trouble IMHO), though it's required you add something unique for the accession/locus if you plan on indexing them in the future. Parsing GenBank data produced from third-party software is problematic at best; there seems to be no steadfast rule with GenBank output for some programs, even though the specification is plainly stated in the NCBI release notes. My take on that is to have a stricter (read:follows release notes) GenBank parser which passes off the data in the record to default handler methods. A user could then subjugate the defined handlers with their own by subclassing the default handler class and overloading the methods or adding their own code references directly. chris ... From rich at thevillas.eclipse.co.uk Fri Jun 8 07:00:45 2007 From: rich at thevillas.eclipse.co.uk (richard) Date: Fri, 08 Jun 2007 12:00:45 +0100 Subject: [Bioperl-l] protparam Message-ID: <466936DD.8080604@thevillas.eclipse.co.uk> Hi, I noticed that in April someone asked whether there was a bioperl mod for obtaining protein sequence related properties using protparam. I have a module that could potentially be submitted to bioperl for this purpose. Does anybody have any thoughts on whether it should go in? Example script and the module are at: http://81.5.159.173/webshare/ Cheers Rich From cjfields at uiuc.edu Fri Jun 8 08:37:27 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 8 Jun 2007 07:37:27 -0500 Subject: [Bioperl-l] protparam In-Reply-To: <466936DD.8080604@thevillas.eclipse.co.uk> References: <466936DD.8080604@thevillas.eclipse.co.uk> Message-ID: <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu> Richard, We'll gladly add this in, though it'll need to be bioperlized (inherit Bio::Root::Root). We also generally ask for tests but it should be easy to write up a quick test suite using any protein seq. If you can could you add some bioperl-like POD to the module (i.e. 
SYNOPSIS, AUTHOR, DESCRIPTION, etc)? thanks! chris On Jun 8, 2007, at 6:00 AM, richard wrote: > > Hi, > > I noticed that in April someone asked whether there was a bioperl mod > for obtaining protein sequence related properties using protparam. > I have a module that could potentially be submitted to bioperl for > this > purpose. Does anybody have any thoughts on whether it should go in? > > Example script and the module are at: > > http://81.5.159.173/webshare/ > > > Cheers > Rich > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From mmokrejs at ribosome.natur.cuni.cz Fri Jun 8 07:09:42 2007 From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=) Date: Fri, 08 Jun 2007 13:09:42 +0200 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? Message-ID: <466938F6.7050903@ribosome.natur.cuni.cz> Hi, how can I convert GenBank/EMBL formatted file to a GFF file? The manpage for Bio::Graphics::FeatureFile does not help me in this way. The information is in the file, so I want just to extract the features to a GFF format, probably somewhere the sequence has to be stored ... Is there a tool so I can convert it automatically? ;) This would be great. I can't make the GFF manually for every file. Other programs draw plasmid maps also automatically from the GenBank formatted input so how can I do it in bioperl? 
Thanks for help, Martin From shameer at ncbs.res.in Fri Jun 8 10:11:00 2007 From: shameer at ncbs.res.in (Shameer Khadar) Date: Fri, 8 Jun 2007 19:41:00 +0530 (IST) Subject: [Bioperl-l] protparam In-Reply-To: <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu> References: <466936DD.8080604@thevillas.eclipse.co.uk> <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu> Message-ID: <54411.192.168.1.1.1181311860.squirrel@mail.ncbs.res.in> Richard, I asked for protparam module in bioperl ! Thats a good job. Cheers, SK > Richard, > > We'll gladly add this in, though it'll need to be bioperlized > (inherit Bio::Root::Root). We also generally ask for tests but it > should be easy to write up a quick test suite using any protein seq. > > If you can could you add some bioperl-like POD to the module (i.e. > SYNOPSIS, AUTHOR, DESCRIPTION, etc)? > > thanks! > > chris > > On Jun 8, 2007, at 6:00 AM, richard wrote: > >> >> Hi, >> >> I noticed that in April someone asked whether there was a bioperl mod >> for obtaining protein sequence related properties using protparam. >> I have a module that could potentially be submitted to bioperl for >> this >> purpose. Does anybody have any thoughts on whether it should go in? >> >> Example script and the module are at: >> >> http://81.5.159.173/webshare/ >> >> >> Cheers >> Rich >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Shameer Khadar Prof. R. 
Sowdhamini's Lab (# 25) The Computational Biology Group National Centre for Biological Sciences (TIFR) GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India T - 91-080-23666001 EXT - 6251 W - http://www.ncbs.res.in From dmessina at wustl.edu Fri Jun 8 10:58:20 2007 From: dmessina at wustl.edu (David Messina) Date: Fri, 8 Jun 2007 09:58:20 -0500 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? In-Reply-To: <466938F6.7050903@ribosome.natur.cuni.cz> References: <466938F6.7050903@ribosome.natur.cuni.cz> Message-ID: <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> Hi Martin, You're in luck -- the BioPerl core distribution includes two scripts for doing just that: genbank2gff genbank2gff3 Look in the scripts directory of the distro. Also, there is a *huge* amount of documentation and examples on the BioPerl website. http://www.bioperl.org/wiki/HOWTOs Reading those, reading the FAQ, and searching the mailing list archives are where I look first when I don't know how to do something in BioPerl. Dave -- Dave Messina Senior Analyst, Assembly Group Genome Sequencing Center Washington University St. Louis, MO From rich at thevillas.eclipse.co.uk Fri Jun 8 11:51:21 2007 From: rich at thevillas.eclipse.co.uk (richard) Date: Fri, 08 Jun 2007 16:51:21 +0100 Subject: [Bioperl-l] protparam In-Reply-To: <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu> References: <466936DD.8080604@thevillas.eclipse.co.uk> <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu> Message-ID: <46697AF9.2090502@thevillas.eclipse.co.uk> Hi, ok, great, that's no problem. I'll add the POD and bioperlize it, thanks Rich Chris Fields wrote: > Richard, > > We'll gladly add this in, though it'll need to be bioperlized > (inherit Bio::Root::Root). We also generally ask for tests but it > should be easy to write up a quick test suite using any protein seq. > > If you can could you add some bioperl-like POD to the module (i.e. > SYNOPSIS, AUTHOR, DESCRIPTION, etc)? > > thanks! 
> > chris > > On Jun 8, 2007, at 6:00 AM, richard wrote: > > >> Hi, >> >> I noticed that in April someone asked whether there was a bioperl mod >> for obtaining protein sequence related properties using protparam. >> I have a module that could potentially be submitted to bioperl for >> this >> purpose. Does anybody have any thoughts on whether it should go in? >> >> Example script and the module are at: >> >> http://81.5.159.173/webshare/ >> >> >> Cheers >> Rich >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From cjfields at uiuc.edu Fri Jun 8 13:45:17 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 8 Jun 2007 12:45:17 -0500 Subject: [Bioperl-l] protparam In-Reply-To: <46697AF9.2090502@thevillas.eclipse.co.uk> References: <466936DD.8080604@thevillas.eclipse.co.uk> <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu> <46697AF9.2090502@thevillas.eclipse.co.uk> Message-ID: Another issue is namespace. I suggest Bio::Tools::ProtParam, though there may be some others out there. We can add support for direct Bio::Seq/PrimarySeq input and other odds and ends once it's committed. Good work! chris On Jun 8, 2007, at 10:51 AM, richard wrote: > > Hi, > > ok, great, that's no problem. I'll add the POD and bioperlize it, > > thanks > Rich > > Chris Fields wrote: >> Richard, >> >> We'll gladly add this in, though it'll need to be bioperlized >> (inherit Bio::Root::Root). We also generally ask for tests but it >> should be easy to write up a quick test suite using any protein seq. 
>> >> If you can could you add some bioperl-like POD to the module (i.e. >> SYNOPSIS, AUTHOR, DESCRIPTION, etc)? >> >> thanks! >> >> chris >> >> On Jun 8, 2007, at 6:00 AM, richard wrote: >> >> >>> Hi, >>> >>> I noticed that in April someone asked whether there was a bioperl >>> mod >>> for obtaining protein sequence related properties using protparam. >>> I have a module that could potentially be submitted to bioperl for >>> this >>> purpose. Does anybody have any thoughts on whether it should go in? >>> >>> Example script and the module are at: >>> >>> http://81.5.159.173/webshare/ >>> >>> >>> Cheers >>> Rich >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> >> Christopher Fields >> Postdoctoral Researcher >> Lab of Dr. Robert Switzer >> Dept of Biochemistry >> University of Illinois Urbana-Champaign >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Mon Jun 11 07:30:24 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 11 Jun 2007 07:30:24 -0400 Subject: [Bioperl-l] script to load ITIS taxonomy Message-ID: <897DB32F-4AEE-4388-A499-C71BFD2281DE@gmx.net> Hi all - I added a script to load the ITIS taxonomy (www.itis.gov) into the phylodb module. It is called load_itis_taxonomy.pl and is in the scripts/ directory. 
It is independent of BioPerl right now (the ITIS download is either a MS SQL Server or an Informix dump - no kidding), but I'm hoping that at some point support for this can be integrated into Bio::TreeIO. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Mon Jun 11 08:24:50 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 11 Jun 2007 07:24:50 -0500 Subject: [Bioperl-l] script to load ITIS taxonomy In-Reply-To: <897DB32F-4AEE-4388-A499-C71BFD2281DE@gmx.net> References: <897DB32F-4AEE-4388-A499-C71BFD2281DE@gmx.net> Message-ID: <99AC6C0F-10DD-4587-AFB3-32BC495CD2BD@uiuc.edu> On Jun 11, 2007, at 6:30 AM, Hilmar Lapp wrote: > Hi all - > > I added a script to load the ITIS taxonomy (www.itis.gov) into the > phylodb module. It is called load_itis_taxonomy.pl and is in the > scripts/ directory. > > It is independent of BioPerl right now (the ITIS download is either a > MS SQL Server or an Informix dump - no kidding), but I'm hoping that > at some point support for this can be integrated into Bio::TreeIO. > > -hilmar > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== I second the TreeIO support. Anyone up for it? chris From ryanx07 at hotmail.com Mon Jun 11 11:24:31 2007 From: ryanx07 at hotmail.com (L Xu) Date: Mon, 11 Jun 2007 10:24:31 -0500 Subject: [Bioperl-l] basic questions Message-ID: I just started to learn BioPerl by reading the BioPerl Tutorial on the BioPerl website. By trying the 1st example on my window, use Bio::Perl; $seq_object = get_sequence('swiss',"ID ROA1_HUMAN"); write_sequence(">roa1.fasta",'fasta',$seq_object); I got the error as the following: ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: swissprot stream with no ID. 
Not swissprot in my book
STACK: Error::throw
STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm:350
STACK: Bio::SeqIO::swiss::next_seq C:/Perl/site/lib/Bio\SeqIO\swiss.pm:178
STACK: Bio::DB::WebDBSeqI::get_Seq_by_id C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm:153
STACK: Bio::Perl::get_sequence C:/Perl/site/lib/Bio/Perl.pm:510
STACK: t8.pl:7

I cannot figure out what is wrong and cannot find the solution on the
web. Could someone help me please?

Also, this leads to my 2nd question: is there a way to search the
archive of the current list?

Thanks so much

R

___________________________________________________________
Sent by ePrompter, the premier email notification software.
Free download at http://www.ePrompter.com.

From dmessina at wustl.edu Mon Jun 11 12:34:29 2007
From: dmessina at wustl.edu (David Messina)
Date: Mon, 11 Jun 2007 11:34:29 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To:
References:
Message-ID: <25517EA3-7BDA-44AC-BDF3-93A6810D9D63@wustl.edu>

The example code works here, but I'm on OS X.

Could you tell us which version of Perl and BioPerl you are using, and
which operating system?

Are you getting anything in the roa1.fasta file?

> is there a way to search the archive of the current list?

http://www.bioperl.org/wiki/Mailing_lists

Dave

From dmessina at wustl.edu Mon Jun 11 14:48:23 2007
From: dmessina at wustl.edu (David Messina)
Date: Mon, 11 Jun 2007 13:48:23 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To:
References:
Message-ID: <127743A7-1923-4DBF-A96E-276B5E0A7692@wustl.edu>

Hi,

Please use 'Reply All' so everyone on the list can follow the
discussion.

Try adding the following line after the line that starts with
$seq_object:

print STDERR ref($seq_object), "\t", $seq_object->display_id, "\n";

And then run the program again.
What do you get? Could you post a complete printout of what you're
doing?

Dave

On Jun 11, 2007, at 11:45 AM, L Xu wrote:
> I used WinXP with BioPerl Inst_version 2.1.8 (Bioperl 1.5.2) and
> activeperl 5.8.8.819 Thank you very much.

From johnsonm at gmail.com Mon Jun 11 20:45:13 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Mon, 11 Jun 2007 19:45:13 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split)
Message-ID:

This bit in Bio::SeqFeature::Gene::Exon is causing me some problems
trying to extend Bio::Tools::Glimmer to handle 'wraparound' genes
(circular genomes):

sub location {
    my ($self,$value) = @_;

    if(defined($value) && $value->isa('Bio::Location::SplitLocationI')) {
        $self->throw("split or compound location is not allowed ".
                     "for an object of type " . ref($self));
    }
    return $self->SUPER::location($value);
}

That seems to be there all the way back to the initial revision
(checked in by Hilmar). I presume it's there because of code like this
(from the seq() method in Bio::SeqFeature::Generic):

    # assuming our seq object is sensible, it should not have to yank
    # the entire sequence out here.
    my $seq = $self->{'_gsf_seq'}->trunc($self->start(), $self->end());

That's not going to work too well with a feature that has a
Bio::Location::Split location. Fixing it up seems straightforward, if
a bit hackish. Something like:

    my $seq;

    if (ref($self->location()) eq 'Bio::Location::Split') {
        # concatenate the sequence of each sub-location in order
        my $seqstring = '';
        my @sublocs = $self->location()->sub_Location();

        foreach my $subloc (@sublocs) {
            $seqstring .= $self->{'_gsf_seq'}->trunc($subloc->start(),
                                                     $subloc->end())->seq();
        }

        # assign to the outer $seq rather than declaring a new one
        $seq = Bio::Seq->new( -id  => $self->{'_gsf_seq'}->display_id(),
                              -seq => $seqstring );
    }
    else {
        $seq = $self->{'_gsf_seq'}->trunc($self->start(), $self->end());
    }

I don't see any companion to trunc() in Bio::PrimarySeqI for joining
sequences. A join() would be handy, and make the above cleaner.
Comments, suggestions, rotten fruit?

From torsten.seemann at infotech.monash.edu.au Tue Jun 12 02:18:27 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Tue, 12 Jun 2007 16:18:27 +1000
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split)
In-Reply-To:
References:
Message-ID:

Mark,

> if (ref($self->location()) eq 'Bio::Location::Split') {
>     my $seqstring;
>     my @sublocs = $self->location()->sub_Location();
>
>     foreach my $subloc (@sublocs) {
>         $seqstring .= $self->{'_gsf_seq'}->trunc($subloc->start(),
>                                                  $subloc->end())->seq();
>     }

Can you use the ->spliced_seq() method to do this?

http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/SeqFeatureI.html#POD11

--
--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University
--Tel +61 3 9905 9010

From pengchy at yahoo.com.cn Tue Jun 12 03:00:46 2007
From: pengchy at yahoo.com.cn (杨 鹏程)
Date: Tue, 12 Jun 2007 15:00:46 +0800 (CST)
Subject: [Bioperl-l] Can't locate loadable object for module TFBS::Ext::pwmsearch
Message-ID: <66745.92089.qm@web15205.mail.cnb.yahoo.com>

hi all,

Today I downloaded the TFBS package from
http://forkhead.cgb.ki.se/TFBS/, uncompressed it, and copied all the
files contained in the TFBS and Ext directories to the directory
"C:\perl\site\lib", then put Ext under the TFBS directory. I ran the
example script1.pl, but got this error message:

Can't locate loadable object for module TFBS::Ext::pwmsearch in @INC (@INC contains: C:/perl/site/lib C:/perl/lib .) at C:/perl/site/lib/TFBS/Matrix/PWM.pm line 141
Compilation failed in require at C:/perl/site/lib/TFBS/Matrix/PWM.pm line 141, <DATA> line 206.
BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/Matrix/PWM.pm line 141, <DATA> line 206.
Compilation failed in require at C:/perl/site/lib/TFBS/Matrix/PFM.pm line 137, <DATA> line 206.
BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/Matrix/PFM.pm line 137, <DATA> line 206.
Compilation failed in require at C:/perl/site/lib/TFBS/DB/TRANSFAC.pm line 52, <DATA> line 206.
BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/DB/TRANSFAC.pm line 52, <DATA> line 206.
Compilation failed in require at script1.pl line 3, <DATA> line 206.
BEGIN failed--compilation aborted at script1.pl line 3, <DATA> line 206.

shell returned 2

When I run the list_matrices.pl script, I get the same message. But
when I empty the pwmsearch.pm file, I get the following message:

TFBS/Ext/pwmsearch.pm did not return a true value at :/perl/site/lib/TFBS/Matr x/PWM.pm line 141, <DATA> line 206.
BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/Matrix/PWM.pm line 11, <DATA> line 206.
Compilation failed in require at C:/perl/site/lib/TFBS/Matrix/PFM.pm line 137, <DATA> line 206.
BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/Matrix/PFM.pm line 17, <DATA> line 206.
Compilation failed in require at C:/perl/site/lib/TFBS/DB/TRANSFAC.pm line 52, <DATA> line 206.
BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/DB/TRANSFAC.pm line2, <DATA> line 206.
Compilation failed in require at script1.pl line 3, <DATA> line 206.
BEGIN failed--compilation aborted at script1.pl line 3, <DATA> line 206.

Has anyone else met the same problem? Is it a bug in the TFBS package?

Best wishes!
Sincerely,
Pengcheng

From bix at sendu.me.uk Tue Jun 12 03:32:02 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 12 Jun 2007 08:32:02 +0100
Subject: [Bioperl-l] Can't locate loadable object for module TFBS::Ext::pwmsearch
In-Reply-To: <66745.92089.qm@web15205.mail.cnb.yahoo.com>
References: <66745.92089.qm@web15205.mail.cnb.yahoo.com>
Message-ID: <466E4BF2.7020504@sendu.me.uk>

杨 鹏程 wrote:
> hi all,
>
> Today, I download the TFBS package from
> http://forkhead.cgb.ki.se/TFBS/, and uncompress it and copy all the
> files contained in the TFBS and Ext directories to directory
> "C:\perl\site\lib", then put Ext under the TFBS directory.
> I run the
> example script1.pl, but a wrong message respond:
>
> Can't locate loadable object for module TFBS::Ext::pwmsearch in @INC

You have to follow the installation instructions in the README file.
Copying the files out is insufficient - you have to 'make'.

From ryanx07 at hotmail.com Tue Jun 12 07:30:09 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Tue, 12 Jun 2007 06:30:09 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To: <127743A7-1923-4DBF-A96E-276B5E0A7692@wustl.edu>
Message-ID:

Here is the code:

use Bio::Perl;
$seq_object = get_sequence('swiss',"ROA1_HUMAN");
print STDERR ref($seq_object), "\t", $seq_object->display_id, "\n";
write_sequence(">roa1.fasta",'fasta',$seq_object);

The output looks like the same as the previous version:

Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.

C:\~Scripts>perl test.pl

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: swissprot stream with no ID. Not swissprot in my book
STACK: Error::throw
STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm:350
STACK: Bio::SeqIO::swiss::next_seq C:/Perl/site/lib/Bio\SeqIO\swiss.pm:178
STACK: Bio::DB::WebDBSeqI::get_Seq_by_id C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm:153
STACK: Bio::Perl::get_sequence C:/Perl/site/lib/Bio/Perl.pm:510
STACK: test.pl:7
-----------------------------------------------------------

Thanks.

>From: David Messina
>To: L Xu
>CC: BioPerl list
>Subject: Re: [Bioperl-l] basic questions
>Date: Mon, 11 Jun 2007 13:48:23 -0500
>
>Hi,
>
>Please use 'Reply All' so everyone on the list can follow the discussion.
>
>Try adding the following line after the line that starts with $seq_object:
>
> print STDERR ref($seq_object), "\t", $seq_object->display_id, "\n";
>
>And then run the program again. What do you get? Could you post a complete
>printout of what you're doing?
> > >Dave > > >On Jun 11, 2007, at 11:45 AM, L Xu wrote: >>I used WinXP with BioPerl Inst_version 2.1.8 (Bioperl 1.5.2) and >>activeperl 5.8.8.819 Thank you very much. > _________________________________________________________________ Picture this ? share your photos and you could win big! http://www.GETREALPhotoContest.com?ocid=TXT_TAGHM&loc=us From pengchy at yahoo.com.cn Tue Jun 12 10:33:15 2007 From: pengchy at yahoo.com.cn (Pengcheng Yang) Date: Tue, 12 Jun 2007 22:33:15 +0800 (CST) Subject: [Bioperl-l] =?gb2312?q?=BB=D8=B8=B4=A3=BA=20Re:=20=20basic=20questions?= In-Reply-To: Message-ID: <936780.8655.qm@web15215.mail.cnb.yahoo.com> I got the same questions. I guess that the swissprote database has some problems! code: use Bio::DB::SwissProt; $sp = new Bio::DB::SwissProt; $seq = $sp->get_Seq_by_id('KPY1_ECOLI'); print ref($seq),"\t",$seq->display_id,"\n" the mesage: ------------- EXCEPTION ------------- MSG: swissprot stream with no ID. Not swissprot in my book STACK Bio::SeqIO::swiss::next_seq C:/perl/site/lib/Bio\SeqIO\swiss.pm:180 STACK Bio::DB::WebDBSeqI::get_Seq_by_id C:/perl/site/lib/Bio/DB/WebDBSeqI.pm:154 STACK toplevel t.pl:7 -------------------------------------- --- L Xu ????: > Here is the code: > > use Bio::Perl; > $seq_object = get_sequence('swiss',"ROA1_HUMAN"); > print STDERR ref($seq_object), "\t", $seq_object->display_id, "\n"; > write_sequence(">roa1.fasta",'fasta',$seq_object); > > The output looks like the same as the previous version: > > Microsoft Windows XP [Version 5.1.2600] > (C) Copyright 1985-2001 Microsoft Corp. > > C:\~Scripts>perl test.pl > > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: swissprot stream with no ID. 
Not swissprot in my book > STACK: Error::throw > STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm:350 > STACK: Bio::SeqIO::swiss::next_seq > C:/Perl/site/lib/Bio\SeqIO\swiss.pm:178 > STACK: Bio::DB::WebDBSeqI::get_Seq_by_id > C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm:15 > 3 > STACK: Bio::Perl::get_sequence C:/Perl/site/lib/Bio/Perl.pm:510 > STACK: test.pl:7 > ----------------------------------------------------------- > > Thanks. > > > > > > >From: David Messina > >To: L Xu > >CC: BioPerl list > >Subject: Re: [Bioperl-l] basic questions > >Date: Mon, 11 Jun 2007 13:48:23 -0500 > > > >Hi, > > > >Please use 'Reply All' so everyone on the list can follow the > discussion. > > > >Try adding the following line after the line that starts with > $seq_object: > > > > print STDERR ref($seq_object), "\t", $seq_object->display_id, "\n"; > > > >And then run the program again. What do you get? Could you post a > complete > >printout of what you're doing? > > > > > >Dave > > > > > >On Jun 11, 2007, at 11:45 AM, L Xu wrote: > >>I used WinXP with BioPerl Inst_version 2.1.8 (Bioperl 1.5.2) and > >>activeperl 5.8.8.819 Thank you very much. > > > > _________________________________________________________________ > Picture this ?share your photos and you could win big! > http://www.GETREALPhotoContest.com?ocid=TXT_TAGHM&loc=us > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Best wishes! Sincerely, Pengcheng ___________________________________________________________ ????????????????3.5G??????20M?????? 
From drummike at gmail.com Tue Jun 12 11:49:36 2007
From: drummike at gmail.com (Mike Williams)
Date: Tue, 12 Jun 2007 11:49:36 -0400
Subject: [Bioperl-l] Re: [Bioperl-l] Reply: Re: basic questions
In-Reply-To: <936780.8655.qm@web15215.mail.cnb.yahoo.com>
References: <936780.8655.qm@web15215.mail.cnb.yahoo.com>
Message-ID:

On 6/12/07, Pengcheng Yang wrote:
> I got the same questions.
> I guess that the swissprote database has some problems!
> code:
> use Bio::DB::SwissProt;
> $sp = new Bio::DB::SwissProt;
> $seq = $sp->get_Seq_by_id('KPY1_ECOLI');
> print ref($seq),"\t",$seq->display_id,"\n"

> ------------- EXCEPTION -------------
> MSG: swissprot stream with no ID. Not swissprot in my book
> STACK toplevel t.pl:7

This is a different problem. The id was not valid. If you change KPY1
to KPYK1 it works fine.

$seq = $sp->get_Seq_by_id('KPYK1_ECOLI');
print ref($seq),"\t",$seq->display_id,"\n"

[mike at Wheatley]$ ./bio_quest2.pl
Bio::Seq::RichSeq KPYK1_ECOLI

If you got this example from the bio perl site would you please post
the url? Seems to me this same problem has come up before, but I could
not find it in the archives nor on the web site.

Mike

From ryanx07 at hotmail.com Tue Jun 12 11:42:28 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Tue, 12 Jun 2007 10:42:28 -0500
Subject: [Bioperl-l] basic questions
Message-ID:

I tested another piece of code (the 2nd test on the same machine) from
the tutorial and got an error again. I don't know what happened; please
help. Thanks so much.
===========================================================
Code:

use Bio::Restriction::EnzymeCollection;

my $all_collection = Bio::Restriction::EnzymeCollection;
my $six_cutter_collection = $all_collection->cutters(6);
for my $enz ($six_cutter_collection){
    print $enz->name,"\t",$enz->site,"\t",$enz->overhang_seq,"\n";
    # prints name, recognition site, overhang
}
===========================================
Results:

C:\~Scripts>perl t9.pl
Can't use string ("Bio::Restriction::EnzymeCollecti") as a HASH ref while "strict refs" in use at C:/Perl/site/lib/Bio/Restriction/EnzymeCollection.pm line 236.

= = = Original message = = =

On Jun 11, 2007, at 11:45 AM, L Xu wrote:
I used WinXP with BioPerl Inst_version 2.1.8 (Bioperl 1.5.2) and
activeperl 5.8.8.819 Thank you very much.

From limericksean at gmail.com Tue Jun 12 12:04:40 2007
From: limericksean at gmail.com (Sean O'Keeffe)
Date: Tue, 12 Jun 2007 18:04:40 +0200
Subject: [Bioperl-l] gff2xml
Message-ID: <462784640706120904g25a6550dsc56a22af64ca98cd@mail.gmail.com>

Hi all,
I posted this on the gbrowse list earlier. I'm looking to convert gff
data files into xml. Does anyone know of a module written to do this
already?
respect,
sean.
From johnsonm at gmail.com Tue Jun 12 12:10:45 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Tue, 12 Jun 2007 11:10:45 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split)
In-Reply-To:
References:
Message-ID:

On 6/12/07, Torsten Seemann wrote:
> Can you use the ->spliced_seq() method to do this?
>
> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/SeqFeatureI.html#POD11
>
> --
> --Torsten Seemann
> --Victorian Bioinformatics Consortium, Monash University
> --Tel +61 3 9905 9010

Actually, I'd forgotten about spliced_seq(). That seems like it will
Do The Right Thing. It's just up to the invoker to call spliced_seq()
instead of seq() as appropriate.

So, is there any other code that will break if I modify
Bio::SeqFeature::Gene::Exon::location to not throw an exception when
encountering Bio::Location::SplitLocationI? I'm wondering if it's
just a paranoid check or if it's there to guard against something. If
the latter, I need to know what code to fix. I'll dig and look, but
if anybody knows or has an idea, save me some time. I suppose I can
just change it and see what tests start failing. 8)

From dmessina at wustl.edu Tue Jun 12 12:11:36 2007
From: dmessina at wustl.edu (David Messina)
Date: Tue, 12 Jun 2007 11:11:36 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To:
References:
Message-ID: <30B8F841-E694-4577-8C15-8703E846CDFE@wustl.edu>

Hmm, it almost looks like you're having an issue with line breaks.

The 'swissprot stream with no ID' error made me think that perhaps
Perl wasn't seeing the second argument to get_sequence. And then your
new program has the error 'Can't use string
("Bio::Restriction::EnzymeCollecti")' where the end of the word is
cut off.

I don't know how ActivePerl handles Windows vs UNIX line breaks. Are
there any example scripts that come with ActivePerl?
If there are, and they run correctly, perhaps you could look to see
how the line breaks are done and make sure that your program does it
the same way.

Other than that, I'm not seeing an obvious answer to your problem --
anyone else have a suggestion?

Perhaps the easiest thing for you to do would be to reinstall BioPerl
and make sure that you run the full test suite and that all of the
tests pass. My guess is that something in your current setup is not
quite right.

Dave

From cjfields at uiuc.edu Tue Jun 12 12:42:29 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 12 Jun 2007 11:42:29 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split)
In-Reply-To:
References:
Message-ID:

On Jun 12, 2007, at 11:10 AM, Mark Johnson wrote:

> On 6/12/07, Torsten Seemann wrote:
>> Can you use the ->spliced_seq() method to do this?
>>
>> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/SeqFeatureI.html#POD11
>>
>> --
>> --Torsten Seemann
>> --Victorian Bioinformatics Consortium, Monash University
>> --Tel +61 3 9905 9010
>
> Actually, I'd forgotten about spliced_seq(). That seems like it
> will Do The Right Thing. It's just up to the invoker to call
> spliced_seq() instead of seq() as appropriate.
> So, is there any other code that will break if I modify
> Bio::SeqFeature::Gene::Exon::location to not throw an exception when
> encountering Bio::Location::SplitLocationI? I'm wondering if it's
> just a paranoid check or if it's there to guard against something. If
> the latter, I need to know what code to fix. I'll dig and look, but
> if anybody knows or has an idea, save me some time. I suppose I can
> just change it and see what tests start failing. 8)

I'm wondering why you want to use Bio::SeqFeature::Gene::Exon to
describe the 'wrap-around' genes. The SeqFeature::Gene::Exon docs
state that the Exon class is used to specifically describe exons, as
the name implies.
Exons are primarily eukaryotic in origin, so you shouldn't encounter wraparounds, and should not have split locations by definition (which likely explains the exception). Wouldn't a SeqFeature::Generic work just as well using a split location? chris From johnsonm at gmail.com Tue Jun 12 12:59:54 2007 From: johnsonm at gmail.com (Mark Johnson) Date: Tue, 12 Jun 2007 11:59:54 -0500 Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split) In-Reply-To: References: Message-ID: That's a good point. Both Bio::Tools::Glimmer and Bio::Tools::Genemark produce Bio::SeqFeature::Gene objects, each with a single Bio::SeqFeature::Gene::Exon, when parsing predictions for prokaryotic sequence (multiple exons for eukaryotic). There are eukaryotic and prokaryotic versions of both predictor families. Maybe the most elegant solution would be to simply modify both modules to only emit Bio::SeqFeature::Generic features when operating on prokaryotic mode output? Fix the data model and the problem goes away. 8) On 6/12/07, Chris Fields wrote: > > On Jun 12, 2007, at 11:10 AM, Mark Johnson wrote: > > > On 6/12/07, Torsten Seemann > > wrote: > >> Can you use the ->spliced_seq() method to do this? > >> > >> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/ > >> SeqFeatureI.html#POD11 > >> > >> -- > >> --Torsten Seemann > >> --Victorian Bioinformatics Consortium, Monash University > >> --Tel +61 3 9905 9010 > > > > Actually, I'd forgotten about spliced_seq(). That seems like it > > will Do The Right Thing. It's just up to the invoker to call > > spliced_seq() instead of seq() as appropriate. > > So, is there any other code that will break if I modify > > Bio::SeqFeature::Gene::Exon::location to not throw an exception when > > encountering Bio::Location::SplitLocationI? I'm wondering if it's > > just a paranoid check or if it's there to guard against something. If > > the latter, I need to know what code to fix. 
I'll dig and look, but > > if anybody knows or has an idea, save me some time. I suppose I can > > just change it and see what tests start failing. 8) > > I'm wondering why you want to use Bio::SeqFeature::Gene::Exon to > describe the 'wrap-around' genes. The SeqFeature::Gene::Exon docs > state that the Exon class is used to specifically describe exons, as > the name implies. Exons are primarily eukaryotic in origin, so you > shouldn't encounter wraparounds, and should not have split locations > by definition (which likely explains the exception). > > Wouldn't a SeqFeature::Generic work just as well using a split location? > > chris > From ryanx07 at hotmail.com Tue Jun 12 13:17:18 2007 From: ryanx07 at hotmail.com (L Xu) Date: Tue, 12 Jun 2007 12:17:18 -0500 Subject: [Bioperl-l] basic questions Message-ID: I reinstalled activePerl and BioPerl, now the activePerl is 5.8.8 build 820. However, both scripts generated the same error with my computer. I tested the code in another WinXP computer with the same versions of activePerl and BioPerl, the one for the swissprot did work but the restriction enzyme generated the same error. = = = Original message = = = Hmm, it almost looks like you're having an issue with line breaks. The 'swissprot stream with no ID' error made me think that perhaps? Perl wasn't seeing the second argument to get_sequence. And then your? new program has the error 'Can't use string? ("Bio::Restriction::EnzymeCollecti")' where the end of the word is? cut off. I don't know how ActivePerl handles Windows vs UNIX line breaks.? Are? there any example scripts that come with ActivePerl? If there are,? and they run correctly, perhaps you could look to see how the line? breaks are done and make sure the your program does it the same way. Other than that, I'm not seeing an obvious answer to your problem --? anyone else have a suggestion? Perhaps the easiest thing for you to do would be to reinstall BioPerl? 
and make sure that you run the full test suite and that all of the
tests pass. My guess is that something in your current setup is not
quite right.

Dave

From cjfields at uiuc.edu Tue Jun 12 13:51:47 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 12 Jun 2007 12:51:47 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To:
References:
Message-ID:

This is an instance where 'use strict' would have shown the problem
right away. You left off your constructor call:

my $all_collection = Bio::Restriction::EnzymeCollection;

should be

my $all_collection = Bio::Restriction::EnzymeCollection->new;

chris

On Jun 12, 2007, at 12:17 PM, L Xu wrote:

> I reinstalled activePerl and BioPerl, now the activePerl is 5.8.8
> build 820.
> However, both scripts generated the same error with my computer. I
> tested the code in another WinXP computer with the same versions of
> activePerl and BioPerl, the one for the swissprot did work but the
> restriction enzyme generated the same error.
>
> = = = Original message = = =
>
> Hmm, it almost looks like you're having an issue with line breaks.
>
> The 'swissprot stream with no ID' error made me think that perhaps
> Perl wasn't seeing the second argument to get_sequence. And then your
> new program has the error 'Can't use string
> ("Bio::Restriction::EnzymeCollecti")' where the end of the word is
> cut off.
>
> I don't know how ActivePerl handles Windows vs UNIX line breaks.
> Are there any example scripts that come with ActivePerl? If there
> are, and they run correctly, perhaps you could look to see how the line
breaks are > done and > make sure the your program does it the same way. > > Other than that, I'm not seeing an obvious answer to your problem > --? anyone > else have a suggestion? > > Perhaps the easiest thing for you to do would be to reinstall > BioPerl? and > make sure that you run the full test suite and that all of the? > tests pass. > My guess is that something in your current setup is not? quite right. > > Dave > > ___________________________________________________________ > Sent by ePrompter, the premier email notification software. > Free download at http://www.ePrompter.com. > > _________________________________________________________________ > Get a preview of Live Earth, the hottest event this summer - only > on MSN > http://liveearth.msn.com?source=msntaglineliveearthhm > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From ryanx07 at hotmail.com Tue Jun 12 14:11:15 2007 From: ryanx07 at hotmail.com (L Xu) Date: Tue, 12 Jun 2007 13:11:15 -0500 Subject: [Bioperl-l] basic questions Message-ID: Thank you very much, it did make the script advanced a bit but I got the following error: C:\~Scripts>perl t9.pl Can't locate object method "name" via package "Bio::Restriction::EnzymeCollectio n" at t9.pl line 5, line 532. I checked the documentation , there is no "name" method for the package. Thanks. = = = Original message = = = This is an instance where 'use strict' would have shown the problem? right away.? You left off your constructor call: my $all_collection = Bio::Restriction::EnzymeCollection; should be my $all_collection = Bio::Restriction::EnzymeCollection->new; chris On Jun 12, 2007, at 12:17 PM, L Xu wrote: I reinstalled activePerl and BioPerl, now the activePerl is 5.8.8? build 820. 
However, both scripts generated the same error with my computer. I? tested the code in another WinXP computer with the same versions of? activePerl and BioPerl, the one for the swissprot did work but the restriction enzyme generated the same error. = = = Original message = = = Hmm, it almost looks like you're having an issue with line breaks. The 'swissprot stream with no ID' error made me think that perhaps?? Perl wasn't seeing the second argument to get_sequence. And then your? new program has the error 'Can't use string? ("Bio::Restriction::EnzymeCollecti")' where the end of the word is?? cut off. I don't know how ActivePerl handles Windows vs UNIX line breaks.?? Are? there any example scripts that come with ActivePerl? If there are,? and? they run correctly, perhaps you could look to see how the line? breaks are? done and make sure the your program does it the same way. Other than that, I'm not seeing an obvious answer to your problem? --? anyone else have a suggestion? Perhaps the easiest thing for you to do would be to reinstall? BioPerl? and make sure that you run the full test suite and that all of the?? tests pass. My guess is that something in your current setup is not? quite right. Dave ___________________________________________________________ Sent by ePrompter, the premier email notification software. Free download at http://www.ePrompter.com. _________________________________________________________________ Get a preview of Live Earth, the hottest event this summer - only? on MSN http://liveearth.msn.com?source=msntaglineliveearthhm _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign ___________________________________________________________ Sent by ePrompter, the premier email notification software. 
Free download at http://www.ePrompter.com.

From cjfields at uiuc.edu Tue Jun 12 14:35:15 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 12 Jun 2007 13:35:15 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To:
References:
Message-ID: <287E93E2-1902-4796-971E-B1DCA805D032@uiuc.edu>

Bio::Restriction::EnzymeCollection holds Bio::Restriction::Enzyme
objects, each with its own name(). Using grouped methods like
'$collection->cutters(6)' will retrieve a new EnzymeCollection
containing all six-cutters from the original collection. You should
use one of the EnzymeCollection accessor methods to retrieve the
enzyme that you wanted first or iterate through them all. This works
for me:

use Bio::Restriction::EnzymeCollection;

my $all_collection = Bio::Restriction::EnzymeCollection->new();
my $six_cutter_collection = $all_collection->cutters(6);
for my $enz ($six_cutter_collection->each_enzyme){
    print $enz->name,"\t",$enz->site,"\t",$enz->overhang_seq,"\n";
}

chris

On Jun 12, 2007, at 1:11 PM, L Xu wrote:

> Thank you very much, it did make the script advanced a bit but I
> got the following error:
>
> C:\~Scripts>perl t9.pl
> Can't locate object method "name" via package
> "Bio::Restriction::EnzymeCollection" at t9.pl line 5, line 532.
>
> I checked the documentation , there is no "name" method for the
> package. Thanks.

From johnsonm at gmail.com Tue Jun 12 15:07:57 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Tue, 12 Jun 2007 14:07:57 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split)
In-Reply-To:
References:
Message-ID:

I'll wait a day, and if there is no opinion to the contrary, implement
it this way.

On 6/12/07, Mark Johnson wrote:
> That's a good point.
> Both Bio::Tools::Glimmer and
> Bio::Tools::Genemark produce Bio::SeqFeature::Gene objects, each with
> a single Bio::SeqFeature::Gene::Exon, when parsing predictions for
> prokaryotic sequence (multiple exons for eukaryotic). There are
> eukaryotic and prokaryotic versions of both predictor families. Maybe
> the most elegant solution would be to simply modify both modules to
> only emit Bio::SeqFeature::Generic features when operating on
> prokaryotic mode output? Fix the data model and the problem goes
> away. 8)

From torsten.seemann at infotech.monash.edu.au Tue Jun 12 20:18:27 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Wed, 13 Jun 2007 10:18:27 +1000
Subject: [Bioperl-l] gff2xml
In-Reply-To: <462784640706120904g25a6550dsc56a22af64ca98cd@mail.gmail.com>
References: <462784640706120904g25a6550dsc56a22af64ca98cd@mail.gmail.com>
Message-ID:

Sean

> I posted this on the gbrowse list earlier. I'm looking to convert gff
> data files into xml. Does anyone know of a module written to do this
> already?

What DTD do you want the XML to conform to? eg. ChadoXML, TinySeq XML,
TIGR XML ... ?

What program are you trying to get to load the XML?

BioPerl has some Bio::SeqIO::xxxxx modules for some XML formats that
you could use.

There is a script "bp_seqconvert.pl -h" which comes with BioPerl which
may be useful.

--
--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University
--Tel +61 3 9905 9010

From hlapp at gmx.net Tue Jun 12 20:55:57 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 12 Jun 2007 20:55:57 -0400
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split)
In-Reply-To:
References:
Message-ID: <0915FAB4-E554-4E65-BA3F-1B916F0F95FC@gmx.net>

I think it was just trying to guard against people trying to do stupid
things.

I'm actually not sure that representing locations on a circular genome
using split locations really is the best thing.
I'm wondering whether one shouldn't rather introduce a CircularLocation object (though obviously it isn't the location that's circular...). Just a thought. In the end, if you have a way to make this work that you feel comfortable with than go for it. -hilmar On Jun 12, 2007, at 12:10 PM, Mark Johnson wrote: > On 6/12/07, Torsten Seemann > wrote: >> Can you use the ->spliced_seq() method to do this? >> >> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/ >> SeqFeatureI.html#POD11 >> >> -- >> --Torsten Seemann >> --Victorian Bioinformatics Consortium, Monash University >> --Tel +61 3 9905 9010 > > Actually, I'd forgotten about spliced_seq(). That seems like it > will Do The Right Thing. It's just up to the invoker to call > spliced_seq() instead of seq() as appropriate. > So, is there any other code that will break if I modify > Bio::SeqFeature::Gene::Exon::location to not throw an exception when > encountering Bio::Location::SplitLocationI? I'm wondering if it's > just a paranoid check or if it's there to guard against something. If > the latter, I need to know what code to fix. I'll dig and look, but > if anybody knows or has an idea, save me some time. I suppose I can > just change it and see what tests start failing. 8) > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Tue Jun 12 20:57:06 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 12 Jun 2007 20:57:06 -0400 Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split) In-Reply-To: References: Message-ID: <80EAA2F1-B2DA-45F0-B591-8534C356E679@gmx.net> I like that. 
Don't force a model to do what you want if it doesn't really apply anyway. -hilmar On Jun 12, 2007, at 12:59 PM, Mark Johnson wrote: > That's a good point. Both Bio::Tools::Glimmer and > Bio::Tools::Genemark produce Bio::SeqFeature::Gene objects, each with > a single Bio::SeqFeature::Gene::Exon, when parsing predictions for > prokaryotic sequence (multiple exons for eukaryotic). There are > eukaryotic and prokaryotic versions of both predictor families. Maybe > the most elegant solution would be to simply modify both modules to > only emit Bio::SeqFeature::Generic features when operating on > prokaryotic mode output? Fix the data model and the problem goes > away. 8) > > On 6/12/07, Chris Fields wrote: >> >> On Jun 12, 2007, at 11:10 AM, Mark Johnson wrote: >> >>> On 6/12/07, Torsten Seemann >>> wrote: >>>> Can you use the ->spliced_seq() method to do this? >>>> >>>> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/ >>>> SeqFeatureI.html#POD11 >>>> >>>> -- >>>> --Torsten Seemann >>>> --Victorian Bioinformatics Consortium, Monash University >>>> --Tel +61 3 9905 9010 >>> >>> Actually, I'd forgotten about spliced_seq(). That seems like it >>> will Do The Right Thing. It's just up to the invoker to call >>> spliced_seq() instead of seq() as appropriate. >>> So, is there any other code that will break if I modify >>> Bio::SeqFeature::Gene::Exon::location to not throw an exception when >>> encountering Bio::Location::SplitLocationI? I'm wondering if it's >>> just a paranoid check or if it's there to guard against >>> something. If >>> the latter, I need to know what code to fix. I'll dig and look, but >>> if anybody knows or has an idea, save me some time. I suppose I can >>> just change it and see what tests start failing. 8) >> >> I'm wondering why you want to use Bio::SeqFeature::Gene::Exon to >> describe the 'wrap-around' genes. The SeqFeature::Gene::Exon docs >> state that the Exon class is used to specifically describe exons, as >> the name implies. 
Exons are primarily eukaryotic in origin, so you >> shouldn't encounter wraparounds, and should not have split locations >> by definition (which likely explains the exception). >> >> Wouldn't a SeqFeature::Generic work just as well using a split >> location? >> >> chris >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Tue Jun 12 21:20:41 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 12 Jun 2007 20:20:41 -0500 Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split) In-Reply-To: <80EAA2F1-B2DA-45F0-B591-8534C356E679@gmx.net> References: <80EAA2F1-B2DA-45F0-B591-8534C356E679@gmx.net> Message-ID: <951EB9CA-2066-4CD1-BCD5-4E00232CA507@uiuc.edu> It will be interesting to see if bioperl handles wrap-around split locations via spliced_seq() and other methods. I can't see why it wouldn't but one never knows. Might be something to add to location tests at some point... chris On Jun 12, 2007, at 7:57 PM, Hilmar Lapp wrote: > I like that. Don't force a model to do what you want if it doesn't > really apply anyway. > > -hilmar > > On Jun 12, 2007, at 12:59 PM, Mark Johnson wrote: > >> That's a good point. Both Bio::Tools::Glimmer and >> Bio::Tools::Genemark produce Bio::SeqFeature::Gene objects, each with >> a single Bio::SeqFeature::Gene::Exon, when parsing predictions for >> prokaryotic sequence (multiple exons for eukaryotic). There are >> eukaryotic and prokaryotic versions of both predictor families. >> Maybe >> the most elegant solution would be to simply modify both modules to >> only emit Bio::SeqFeature::Generic features when operating on >> prokaryotic mode output? 
Fix the data model and the problem goes >> away. 8) >> >> On 6/12/07, Chris Fields wrote: >>> >>> On Jun 12, 2007, at 11:10 AM, Mark Johnson wrote: >>> >>>> On 6/12/07, Torsten Seemann >>>> wrote: >>>>> Can you use the ->spliced_seq() method to do this? >>>>> >>>>> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/ >>>>> SeqFeatureI.html#POD11 >>>>> >>>>> -- >>>>> --Torsten Seemann >>>>> --Victorian Bioinformatics Consortium, Monash University >>>>> --Tel +61 3 9905 9010 >>>> >>>> Actually, I'd forgotten about spliced_seq(). That seems >>>> like it >>>> will Do The Right Thing. It's just up to the invoker to call >>>> spliced_seq() instead of seq() as appropriate. >>>> So, is there any other code that will break if I modify >>>> Bio::SeqFeature::Gene::Exon::location to not throw an exception >>>> when >>>> encountering Bio::Location::SplitLocationI? I'm wondering if it's >>>> just a paranoid check or if it's there to guard against >>>> something. If >>>> the latter, I need to know what code to fix. I'll dig and look, >>>> but >>>> if anybody knows or has an idea, save me some time. I suppose I >>>> can >>>> just change it and see what tests start failing. 8) >>> >>> I'm wondering why you want to use Bio::SeqFeature::Gene::Exon to >>> describe the 'wrap-around' genes. The SeqFeature::Gene::Exon docs >>> state that the Exon class is used to specifically describe exons, as >>> the name implies. Exons are primarily eukaryotic in origin, so you >>> shouldn't encounter wraparounds, and should not have split locations >>> by definition (which likely explains the exception). >>> >>> Wouldn't a SeqFeature::Generic work just as well using a split >>> location? 
>>>
>>> chris
>>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> --
> ===========================================================
> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
> ===========================================================
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From ryanx07 at hotmail.com Wed Jun 13 08:16:15 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Wed, 13 Jun 2007 07:16:15 -0500
Subject: [Bioperl-l] Example code in Bioperl Tutorial
Message-ID:

Thanks so much, Chris, it works now. All the codes I tested were copied from Bioperl Tutorial. Why did they have such problems, because of the platform issue or different versions of BioPerl? I tested so far 6 scripts, three work and three don't. Here is the problem for the 3rd failed script:

=================================
use strict;
use Bio::Tools::Run::RemoteBlast;

my $remote_blast = Bio::Tools::Run::RemoteBlast->new (
    -prog   => 'blastn',
    -data   => 'ecoli',
    -expect => '1e-10'
);
my $r = $remote_blast->submit_blast("d1.fa");
my $rc;
while ( my @rids = $remote_blast->each_rid ) {
    for my $rid ( @rids ) {
        $rc = $remote_blast->retrieve_blast($rid);
    }
}
print "$rc\n"; #I just want to print sth here before parsing the result
=========================================================d1.fa
>example
CCCTTCAGGTACCCCGAGGTAACACGAGACACTCGGGATCTGGGAAGGGGACTGGGGCTTCTTTAAAAGCGCTCAGTTTAAAAAGCTTCTATGCCTGAATAGGTGACCGGAGGCCGGCACC
=========================================================result
C:\>perl t13.pl

-------------------- WARNING ---------------------
MSG: An Error Occurred

500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error) --------------------------------------------------- -------------------- WARNING --------------------- MSG: An Error Occurred

500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
---------------------------------------------------
Terminating on signal SIGINT(2)

C:\>

Please help me to correct the problem, thanks.

= = = Original message = = =

Bio::Restriction::EnzymeCollection holds Bio::Restriction::Enzyme objects, each with its own name(). Using grouped methods like '$collection->cutters(6)' will retrieve a new EnzymeCollection containing all six-cutters from the original collection. You should use one of the EnzymeCollection accessor methods to retrieve the enzyme that you wanted first or iterate through them all. This works for me:

use Bio::Restriction::EnzymeCollection;

my $all_collection = Bio::Restriction::EnzymeCollection->new();
my $six_cutter_collection = $all_collection->cutters(6);
for my $enz ($six_cutter_collection->each_enzyme) {
    print $enz->name,"\t",$enz->site,"\t",$enz->overhang_seq,"\n";
}

chris

On Jun 12, 2007, at 1:11 PM, L Xu wrote:

Thank you very much, it did make the script advanced a bit but I got the following error:

C:\~Scripts>perl t9.pl
Can't locate object method "name" via package "Bio::Restriction::EnzymeCollectio
n" at t9.pl line 5, line 532.

I checked the documentation , there is no "name" method for the package. Thanks.

From cjfields at uiuc.edu Wed Jun 13 10:41:55 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 09:41:55 -0500
Subject: [Bioperl-l] Example code in Bioperl Tutorial
In-Reply-To:
References:
Message-ID: <4F7BE556-BD8C-4378-BDE7-1F31364F49DA@uiuc.edu>

Judging by the output it looks like you have no network access or can't connect to the server (what remoteblast needs). Make sure you don't need proxy settings.

To preempt the next question, no, I'm not going to explain what a proxy is. The RemoteBlast docs show how to set them, and Google is a wonderful tool...

chris

On Jun 13, 2007, at 7:16 AM, L Xu wrote:

> ...
> -------------------- WARNING ---------------------
> MSG:
> An Error Occurred
>
>

> 500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
>
> ---------------------------------------------------
> ...

From ryanx07 at hotmail.com Wed Jun 13 11:01:07 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Wed, 13 Jun 2007 10:01:07 -0500
Subject: [Bioperl-l] Example code in Bioperl Tutorial
Message-ID:

I do have an internet connection but do not use a proxy server. I tested the network connection with the ping command (below). The NCBI website does not respond. Is there any special network setting needed for connecting to the NCBI website? Thank you so much.

C:\>ping www.yahoo.com

Pinging www.yahoo-ht3.akadns.net [69.147.114.210] with 32 bytes of data:

Reply from 69.147.114.210: bytes=32 time=363ms TTL=45
Reply from 69.147.114.210: bytes=32 time=319ms TTL=45
Reply from 69.147.114.210: bytes=32 time=312ms TTL=45
Reply from 69.147.114.210: bytes=32 time=360ms TTL=45

Ping statistics for 69.147.114.210:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 312ms, Maximum = 363ms, Average = 338ms

C:\>ping www.ncbi.nlm.nih.gov

Pinging www.ncbi.nlm.nih.gov [130.14.29.110] with 32 bytes of data:

Request timed out.
Request timed out.
Request timed out.
Request timed out.

Ping statistics for 130.14.29.110:
    Packets: Sent = 4, Received = 0, Lost = 4 (100% loss),

= = = Original message = = =

Judging by the output it looks like you have no network access or can't connect to the server (what remoteblast needs). Make sure you don't need proxy settings.

To preempt the next question, no, I'm not going to explain what a proxy is. The RemoteBlast docs show how to set them, and Google is a wonderful tool...

chris

On Jun 13, 2007, at 7:16 AM, L Xu wrote:

...
-------------------- WARNING ---------------------
MSG: An Error Occurred

500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
---------------------------------------------------
...

From cjfields at uiuc.edu Wed Jun 13 12:14:22 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 11:14:22 -0500
Subject: [Bioperl-l] method naming
Message-ID: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>

Some quick questions on method naming. I couldn't find this on the mail list previously and just want some opinions.

1) Is there any preference on how to name a method that returns a list of class instances vs. data? I have seen 'each' (each_Location, each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs. simple (hits, hsps).

2) Do we want have methods which return objects have the object name in Title Case (each_Location, get_Seq_by_id, etc) or does it really matter?

chris

From dmessina at wustl.edu Wed Jun 13 12:41:53 2007
From: dmessina at wustl.edu (David Messina)
Date: Wed, 13 Jun 2007 11:41:53 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
Message-ID: <9A51046D-4827-4CF7-A2B7-7880E03129E9@wustl.edu>

> 1) Is there any preference on how to name a method that returns a
> list of class instances vs. data? I have seen 'each' (each_Location,
> each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs.
> simple (hits, hsps).

I'd prefer 'get_all' because it's more intuitive to me what the method is doing. 'Each' is too programmer-y.
> 2) Do we want have methods which return objects have the object name > in Title Case (each_Location, get_Seq_by_id, etc) or does it really > matter? I like Title Case because it reinforces the notion that what you're getting back is a specific object with that name (Seq) rather than the generic thing that the name represents (AGTCTGTGATAT, the actual sequence as a string). Dave From hlapp at gmx.net Wed Jun 13 13:03:59 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 13 Jun 2007 13:03:59 -0400 Subject: [Bioperl-l] method naming In-Reply-To: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> Message-ID: <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> We set a convention a while back on how to name these. It is implemented in the bioperl.lisp file (too bad no one is using emacs any more these days - it's a great editor), and in fact we started a renaming campaign (not sure when that was) on the SeqI and SeqFeatureI classes (you'll still see the old names aliased). However, we never got to finish the clean up. The convention was to use get_{ClassName}s, and get_all_{ClassName}s if there is a difference to the former (mostly because of hierarchical data; for example features can be nested, and get_all_SeqFeatures returns them all flattened out, while get_SeqFeatures returns only the top objects), and for modifying add_ {ClassName} and remove_{ClassName}s. The class name was to be in title case to emphasize the fact that it is an array of object you'd be getting back (and what kind of objects). If it is strings or any other scalar type, the name would be in lower case. -hilmar On Jun 13, 2007, at 12:14 PM, Chris Fields wrote: > Some quick questions on method naming. I couldn't find this on the > mail list previously and just want some opinions. > > 1) Is there any preference on how to name a method that returns a > list of class instances vs. data? I have seen 'each' (each_Location, > each_tag_value) vs. 
'get_all' (get_all_tags, get_all_SeqFeatures) vs. > simple (hits, hsps). > > 2) Do we want have methods which return objects have the object name > in Title Case (each_Location, get_Seq_by_id, etc) or does it really > matter? > > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Wed Jun 13 13:19:43 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 13 Jun 2007 12:19:43 -0500 Subject: [Bioperl-l] method naming In-Reply-To: <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> Message-ID: Sounds good. I agree with Dave also one the use of 'each', as it's a bit ambiguous (seems to imply iteration as opposed to returning a whole list). We probably need to post this somewhere on the wiki for future reference; maybe in Advanced BioPerl? I'll add this in shortly. chris On Jun 13, 2007, at 12:03 PM, Hilmar Lapp wrote: > We set a convention a while back on how to name these. It is > implemented in the bioperl.lisp file (too bad no one is using emacs > any more these days - it's a great editor), and in fact we started > a renaming campaign (not sure when that was) on the SeqI and > SeqFeatureI classes (you'll still see the old names aliased). > > However, we never got to finish the clean up. > > The convention was to use get_{ClassName}s, and get_all_{ClassName} > s if there is a difference to the former (mostly because of > hierarchical data; for example features can be nested, and > get_all_SeqFeatures returns them all flattened out, while > get_SeqFeatures returns only the top objects), and for modifying > add_{ClassName} and remove_{ClassName}s. 
> > The class name was to be in title case to emphasize the fact that > it is an array of object you'd be getting back (and what kind of > objects). If it is strings or any other scalar type, the name would > be in lower case. > > -hilmar > > On Jun 13, 2007, at 12:14 PM, Chris Fields wrote: > >> Some quick questions on method naming. I couldn't find this on the >> mail list previously and just want some opinions. >> >> 1) Is there any preference on how to name a method that returns a >> list of class instances vs. data? I have seen 'each' (each_Location, >> each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs. >> simple (hits, hsps). >> >> 2) Do we want have methods which return objects have the object name >> in Title Case (each_Location, get_Seq_by_id, etc) or does it really >> matter? >> >> chris >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Wed Jun 13 14:43:41 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 13 Jun 2007 13:43:41 -0500 Subject: [Bioperl-l] method naming In-Reply-To: <467036FC.8000505@watson.wustl.edu> References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <9A51046D-4827-4CF7-A2B7-7880E03129E9@wustl.edu> <467036FC.8000505@watson.wustl.edu> Message-ID: <286EE81C-0926-4AAE-9110-02948DFADF36@uiuc.edu> On Jun 13, 2007, at 1:27 PM, Michael Kiwala wrote: > > David Messina wrote: >>> 1) Is there any preference on how to name a method that returns a >>> list of class instances vs. data? I have seen >>> 'each' (each_Location, >>> each_tag_value) vs. 
'get_all' (get_all_tags, get_all_SeqFeatures)
>>> vs.
>>> simple (hits, hsps).
>>>
>>
>> I'd prefer 'get_all' because it's more intuitive to me what the
>> method is doing. 'Each' is too programmer-y.
>>
>>
>>
> When I think 'get_all', I think of a method that returns a list of
> objects at once. When I think of 'each', I think of a method that
> returns a scalar but can be called multiple times to iterate over a
> set of objects.

Yep, hence the ambiguity issue (and my confusion). I think it was so you could both iterate and return a list using this:

    for my $obj ($seq->each_Class) {...}
    my @objs = $seq->each_Class;

I use 'next' and 'get/get_all' as an iterator and get accessor (similar to how it's used in Bio::SearchIO):

    while (my $obj = $seq->next_Class) {...}
    my @objs = $seq->get_Class; # or get_all_Class for flattened lists

which to me is much clearer.

chris

From mkiwala at watson.wustl.edu Wed Jun 13 14:27:08 2007
From: mkiwala at watson.wustl.edu (Michael Kiwala)
Date: Wed, 13 Jun 2007 13:27:08 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <9A51046D-4827-4CF7-A2B7-7880E03129E9@wustl.edu>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <9A51046D-4827-4CF7-A2B7-7880E03129E9@wustl.edu>
Message-ID: <467036FC.8000505@watson.wustl.edu>

David Messina wrote:
>> 1) Is there any preference on how to name a method that returns a
>> list of class instances vs. data? I have seen 'each' (each_Location,
>> each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs.
>> simple (hits, hsps).
>>
>
> I'd prefer 'get_all' because it's more intuitive to me what the
> method is doing. 'Each' is too programmer-y.
>
>
>
When I think 'get_all', I think of a method that returns a list of objects at once. When I think of 'each', I think of a method that returns a scalar but can be called multiple times to iterate over a set of objects.
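
[Editorial sketch, not part of the original thread.] The naming convention discussed in this thread (get_{ClassName}s for top-level objects, get_all_{ClassName}s for a flattened list, add_{ClassName} and remove_{ClassName}s for modification, next_{ClassName} as an iterator) might look roughly like the following. This is only a minimal illustration under stated assumptions: the package My::FeatureHolder, its hash-based "features" and the 'children' key are hypothetical and are not BioPerl classes or APIs.

```perl
use strict;
use warnings;

# Hypothetical container class, NOT part of BioPerl. It exists only to
# illustrate the method naming convention discussed in this thread.
package My::FeatureHolder;

sub new {
    my ($class) = @_;
    return bless { features => [], cursor => 0 }, $class;
}

# add_{ClassName}: add one object.
sub add_Feature {
    my ($self, $feat) = @_;
    push @{ $self->{features} }, $feat;
    return $self;
}

# remove_{ClassName}s: remove all objects and return them.
sub remove_Features {
    my ($self) = @_;
    my @removed = @{ $self->{features} };
    $self->{features} = [];
    $self->{cursor}   = 0;
    return @removed;
}

# get_{ClassName}s: return the top-level objects as a list.
sub get_Features {
    my ($self) = @_;
    return @{ $self->{features} };
}

# get_all_{ClassName}s: return top-level and nested objects, flattened.
sub get_all_Features {
    my ($self) = @_;
    return map { ( $_, _descendants($_) ) } $self->get_Features;
}

# Helper: recursively collect nested features (assumed 'children' key).
sub _descendants {
    my ($feat) = @_;
    return map { ( $_, _descendants($_) ) } @{ $feat->{children} || [] };
}

# next_{ClassName}: iterator; one object per call, empty return at the end.
sub next_Feature {
    my ($self) = @_;
    my $feats = $self->{features};
    if ( $self->{cursor} >= @$feats ) {
        $self->{cursor} = 0;    # reset so the iteration can be repeated
        return;
    }
    return $feats->[ $self->{cursor}++ ];
}

package main;

my $holder = My::FeatureHolder->new;
$holder->add_Feature(
    { name => 'gene1', children => [ { name => 'exon1' }, { name => 'exon2' } ] }
);
$holder->add_Feature( { name => 'gene2' } );

my @top = $holder->get_Features;        # top-level objects only
my @all = $holder->get_all_Features;    # nested objects flattened in

print scalar(@top), " top-level, ", scalar(@all), " in total\n";

while ( my $feat = $holder->next_Feature ) {
    print "next: $feat->{name}\n";
}
```

The Title Case "Feature" in each method name signals that objects come back; under the same convention a method returning plain strings would use lower case instead.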
From sac at bioperl.org Wed Jun 13 17:17:27 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Wed, 13 Jun 2007 14:17:27 -0700
Subject: [Bioperl-l] method naming
In-Reply-To: <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
Message-ID: <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>

On 6/13/07, Hilmar Lapp wrote:
> We set a convention a while back on how to name these. It is
> implemented in the bioperl.lisp file (too bad no one is using emacs
> any more these days - it's a great editor),

As a staunch xemacs-ophile I couldn't let that one slip by. Maybe we could improve the visibility of bioperl.lisp. In truth, I had forgotten about it, though it turns out I was loading an old version of it. (Btw, using the latest version of bioperl.lisp with xemacs 21.4.17, I don't get a bioperl menu item, though I can access bioperl functions via M-x. Suggestions?)

I see bioperl.lisp is mentioned twice parenthetically in the advanced bioperl wiki page. Perhaps a separate 'Editor/IDE support' item here would help. While we're at it, maybe we could add a bioperl.vi file to the distribution (if you can do such things with vi/vim).

On 6/13/07, Chris Fields wrote:
> We probably need to post this somewhere on the wiki for future
> reference; maybe in Advanced BioPerl? I'll add this in shortly.

Another idea: Add a method naming check to the set of audits we perform on CVS committed code. It could check for agreement with our conventions and warn if nothing was found (may not be a problem though).
Steve From arareko at campus.iztacala.unam.mx Wed Jun 13 18:03:34 2007 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Wed, 13 Jun 2007 17:03:34 -0500 Subject: [Bioperl-l] method naming In-Reply-To: <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com> References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com> Message-ID: <467069B6.7080003@campus.iztacala.unam.mx> By the time of the 1.5.2 release, I jumped onto the idea of creating a BioPerl template for Komodo. Chris F handed me one he had already made but in the end I didn't had enough spare time to get into it. If someone wants to give it a try please let ChrisF/me know. Regards, Mauricio. Steve Chervitz wrote: > On 6/13/07, Hilmar Lapp wrote: >> We set a convention a while back on how to name these. It is >> implemented in the bioperl.lisp file (too bad no one is using emacs >> any more these days - it's a great editor), > > As a staunch xemacs-ophile I couldn't let that one slip by. Maybe we > could improve the visibility of bioperl.lisp. In truth, I had > forgotten about it, though lit turns out I was loading an old version > of it. (Btw, using the latest version of bioperl.lisp with xemacs > 21.4.17, I don't get a bioperl menu item, though I can access bioperl > functions via M-x. Suggestions?) > > I see bioperl.lisp is mentioned twice parenthetically in the advanced > bioperl wiki page. Perhaps a separate 'Editor/IDE support' item here > would help. While we're at it, maybe we could add a bioperl.vi file to > the distribution (if you can do such things with vi/vim). > > On 6/13/07, Chris Fields wrote: >> We probably need to post this somewhere on the wiki for future >> reference; maybe in Advanced BioPerl? I'll add this in shortly. > > Another idea: Add a method naming check to the set of audits we > perform on CVS committed code. 
It could check for agreement with our > conventions and warn if nothing was found (may not be a problem > though). > > Steve > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Gen?tica Unidad de Morfofisiolog?a y Funci?n Facultad de Estudios Superiores Iztacala, UNAM From hlapp at gmx.net Wed Jun 13 18:41:45 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 13 Jun 2007 18:41:45 -0400 Subject: [Bioperl-l] method naming In-Reply-To: <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com> References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com> Message-ID: On Jun 13, 2007, at 5:17 PM, Steve Chervitz wrote: > using the latest version of bioperl.lisp with xemacs 21.4.17, I > don't get a bioperl menu item I'm using GNU Emacs 22.0.50.1 (as Aquamacs) and the BioPerl menu item it showing up just beautifully. (BTW it also have very nice icons for various functions - though I always feel guilty for using keystrokes instead.) Is GNU Emacs finally winning this? ;) -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From jason at bioperl.org Wed Jun 13 18:58:51 2007 From: jason at bioperl.org (Jason Stajich) Date: Wed, 13 Jun 2007 15:58:51 -0700 Subject: [Bioperl-l] method naming In-Reply-To: References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com> Message-ID: <4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org> Post your dualing screenshots to the wiki! 
I had started a couple of IDE pages on the wiki a while ago: http://bioperl.org/wiki/Emacs http://bioperl.org/wiki/Emacs_template http://bioperl.org/wiki/Vi If anyone is feeling excited enough to write a few more IDE pages and link them into a common article that would be great. -jason On Jun 13, 2007, at 3:41 PM, Hilmar Lapp wrote: > > On Jun 13, 2007, at 5:17 PM, Steve Chervitz wrote: > >> using the latest version of bioperl.lisp with xemacs 21.4.17, I >> don't get a bioperl menu item > > I'm using GNU Emacs 22.0.50.1 (as Aquamacs) and the BioPerl menu item > it showing up just beautifully. (BTW it also have very nice icons for > various functions - though I always feel guilty for using keystrokes > instead.) > > Is GNU Emacs finally winning this? ;) > > -hilmar > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From cjfields at uiuc.edu Wed Jun 13 19:08:17 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 13 Jun 2007 18:08:17 -0500 Subject: [Bioperl-l] method naming In-Reply-To: <4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org> References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com> <4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org> Message-ID: Would probably be worth writing one up for Komodo since Mauricio, Sendu, and I use it. I updated the Advanced BioPerl page with Hilmar's methods suggestions/ rules (as well as a few I found dating back a number of years on the mail list). 
It might be worth a glance in case there are any changes needed: http://www.bioperl.org/wiki/Advanced_BioPerl chris On Jun 13, 2007, at 5:58 PM, Jason Stajich wrote: > Post your dualing screenshots to the wiki! > > I had started a couple of IDE pages on the wiki a while ago: > http://bioperl.org/wiki/Emacs > http://bioperl.org/wiki/Emacs_template > http://bioperl.org/wiki/Vi > > If anyone is feeling excited enough to write a few more IDE pages > and link them into a common article that would be great. > > -jason > On Jun 13, 2007, at 3:41 PM, Hilmar Lapp wrote: > >> >> On Jun 13, 2007, at 5:17 PM, Steve Chervitz wrote: >> >>> using the latest version of bioperl.lisp with xemacs 21.4.17, I >>> don't get a bioperl menu item >> >> I'm using GNU Emacs 22.0.50.1 (as Aquamacs) and the BioPerl menu item >> it showing up just beautifully. (BTW it also have very nice icons for >> various functions - though I always feel guilty for using keystrokes >> instead.) >> >> Is GNU Emacs finally winning this? ;) >> >> -hilmar >> -- >> =========================================================== >> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >> =========================================================== >> >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > Jason Stajich > jason at bioperl.org > http://jason.open-bio.org/ > > Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Wed Jun 13 19:28:17 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 13 Jun 2007 19:28:17 -0400 Subject: [Bioperl-l] method naming In-Reply-To: References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com> <4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org> Message-ID: <06AE29E7-6FFA-4F92-8BC8-39D9E48549E4@gmx.net> Thanks Chris for doing this - looks great. The only comment that I have is that method names should never start with a capital letter. If the getter/setter is for a single object (as opposed to a list), the name should probably be similar (if not identical) to the class being expected and returned, but lower-case. E.g., $feature->location(), $seq->species() etc -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : ===========================================================
From cjfields at uiuc.edu Wed Jun 13 19:44:08 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 13 Jun 2007 18:44:08 -0500 Subject: [Bioperl-l] method naming In-Reply-To: <06AE29E7-6FFA-4F92-8BC8-39D9E48549E4@gmx.net> References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com> <4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org> <06AE29E7-6FFA-4F92-8BC8-39D9E48549E4@gmx.net> Message-ID: <91AF2018-EC27-49FD-A4D1-C31C0E73DEFB@uiuc.edu> Agreed. We can definitely add that in.
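As a rough illustration of the naming rule discussed in this thread (a lower-case getter/setter named after the class it accepts and returns), a sketch might look like the following. My::Feature and My::Location are hypothetical stand-ins invented for this example, not real BioPerl modules.

```perl
use strict;
use warnings;

# Hypothetical classes, for illustration only (not real BioPerl modules).
package My::Location;

sub new {
    my ($class, %args) = @_;
    my $self = { %args };
    return bless $self, $class;
}

sub start { return $_[0]->{start} }

package My::Feature;

sub new {
    my ($class) = @_;
    my $self = {};
    return bless $self, $class;
}

# Getter/setter for a single My::Location object: lower-case and named
# after the class it takes and returns (not Location() or GetLocation()).
sub location {
    my ($self, $loc) = @_;
    $self->{location} = $loc if defined $loc;
    return $self->{location};
}

package main;

my $feature = My::Feature->new;
$feature->location( My::Location->new(start => 10) );
print $feature->location->start, "\n";    # prints 10
```

The caller can then guess the accessor name from the class of the object it expects back, which seems to be the point of the rule.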
As we edge towards another release we try another round of cleaning up. I wouldn't mind pushing out another 1.5 point release before summer's up if possible; most of the tough work was done for v.1.5.2 by Sendu. chris On Jun 13, 2007, at 6:28 PM, Hilmar Lapp wrote: > Thanks Chris for doing this - looks great. The only comment that I > have is that method names should never start with a capital letter. > If the getter/setter is for a single object (as opposed to a list), > the name should probably be similar (if not identical) to the class > being expected and returned, but lower-case. > > E.g., $feature->location(), $seq->species() etc > > -hilmar > > On Jun 13, 2007, at 7:08 PM, Chris Fields wrote: > >> Would probably be worth writing one up for Komodo since Mauricio, >> Sendu, and I use it. >> >> I updated the Advanced BioPerl page with Hilmar's methods >> suggestions/rules (as well as a few I found dating back a number of >> years on the mail list). It might be worth a glance in case there >> are any changes needed: >> >> http://www.bioperl.org/wiki/Advanced_BioPerl >> >> chris ... From johncumbers at gmail.com Wed Jun 13 20:20:42 2007 From: johncumbers at gmail.com (John Cumbers) Date: Wed, 13 Jun 2007 20:20:42 -0400 Subject: [Bioperl-l] How can I pull out all instances of a motif from a genome sequence and output them as a BED file? Message-ID: Hello, I have a simple problem, I'm trying to search a genome sequence for a motif, I then want to output a BED file to display all the locations of this motif on the UCSC Genome Browser. I could not find a script to do this, so I started to write my own. I'm new to perl and my code below was my attempt to read the sequence string and output the index bp of the start of each motif. With this I could build the BED file myself, which requires start and finish base pairs. For the first motif I can output the start index, but when I try and read the next one off the sequence it does not work. 
Instead I just get an output of a list of 1's. I realise that this is more a request for some simple Perl help, but any help is much appreciated. Best wishes, John

$seq_object = read_sequence("Drosophila.Chr3.test.AE014296.fasta"); # turn my FASTA file into a seq object
$sequence_as_a_string = $seq_object->seq(); # turn it into a string

# search $sequence_as_a_string string for motif AAA as example
# if found, return the index that it is found at
while ($sequence_as_a_string =~ m/AAA/g) {
    print "Found '$&'. Next attempt at character " . pos($sequence_as_a_string)+1 . "\n";
}

-- John Cumbers, Graduate Student Biology and Medicine Brown University, Box G-W Providence, Rhode Island, 02912, USA Tel USA: +1 401 523 8190, Fax: +1 401 863-2166 UK to USA: 0207 617 7824
From cjfields at uiuc.edu Wed Jun 13 21:58:37 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 13 Jun 2007 20:58:37 -0500 Subject: [Bioperl-l] How can I pull out all instances of a motif from a genome sequence and output them as a BED file? In-Reply-To: References: Message-ID: <5F3F49B4-CE07-4DD7-B82E-9DDE42B516D0@uiuc.edu> This is answered in the FAQ (sorry if the URL wraps, but we don't like tinyurls): http://www.bioperl.org/wiki/FAQ#How_do_I_do_motif_searches_with_BioPerl.3F_Can_I_do_.22find_all_sequences_that_are_75.25_identical.22_to_a_given_motif.3F chris Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign
From jason at bioperl.org Thu Jun 14 00:08:04 2007 From: jason at bioperl.org (Jason Stajich) Date: Wed, 13 Jun 2007 21:08:04 -0700 Subject: [Bioperl-l] wiki bulk update Message-ID: <992B2C7A-E944-4C69-BDE0-B0B0F6D1274D@bioperl.org> I did a bulk update of Module pages for new modules that had been created since we last set up these pages: I outlined a little bit of what it requires behind the scenes.
http://bioperl.org/wiki/BioPerl:Module_pages -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From bix at sendu.me.uk Thu Jun 14 05:35:00 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 14 Jun 2007 10:35:00 +0100 Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new() Message-ID: <46710BC4.3060302@sendu.me.uk> It is preferable to have ->new syntax over new Object syntax, as outlined here: http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules I propose making this syntax change in all Bioperl POD documentation, so that the bad syntax is no longer suggested/encouraged. Any objections? If not, I'll go ahead and commit the changes. (affects 907 modules in live) Cheers, Sendu. From bix at sendu.me.uk Thu Jun 14 06:01:02 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 14 Jun 2007 11:01:02 +0100 Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new() In-Reply-To: <46710BC4.3060302@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> Message-ID: <467111DE.6060800@sendu.me.uk> Sendu Bala wrote: > It is preferable to have ->new syntax over new Object syntax, as > outlined here: > http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules > > > I propose making this syntax change in all Bioperl POD documentation, Actually, I propose making the change to code as well. From hlapp at gmx.net Thu Jun 14 08:47:47 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 14 Jun 2007 08:47:47 -0400 Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new() In-Reply-To: <467111DE.6060800@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <467111DE.6060800@sendu.me.uk> Message-ID: <0D7CD74F-DCB3-44F8-9AC7-144B1BD58946@gmx.net> Sounds fine to me. People do go by working examples, and I've seen inconsistent examples leading to confusion on the end of newbies. 
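For anyone wondering what the two spellings look like side by side, here is a minimal sketch. My::Seq and its -id argument are made-up names for illustration only, not BioPerl classes. Both calls build the same object here, but with the indirect form Perl has to guess at compile time whether "new" is a method call or a plain function call, which is presumably why the Best Practices page discourages it.

```perl
use strict;
use warnings;

# Hypothetical class, for illustration only (not part of BioPerl).
package My::Seq;

sub new {
    my ($class, %args) = @_;
    my $self = { id => $args{-id} };
    return bless $self, $class;
}

sub id { return $_[0]->{id} }

package main;

# Discouraged: indirect object syntax. The parser decides heuristically
# how to interpret "new My::Seq(...)".
my $old = new My::Seq(-id => 'seq1');

# Preferred: explicit class-method call, which is never ambiguous.
my $new = My::Seq->new(-id => 'seq1');

print ref($old), " ", $old->id, "\n";    # prints "My::Seq seq1"
print ref($new), " ", $new->id, "\n";    # prints "My::Seq seq1"
```

The arrow form also keeps working in code that turns the indirect-object feature off, so it is the safer habit for documentation examples.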
-hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : ===========================================================
From cjfields at uiuc.edu Thu Jun 14 08:55:18 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 14 Jun 2007 07:55:18 -0500 Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new() In-Reply-To: <467111DE.6060800@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <467111DE.6060800@sendu.me.uk> Message-ID: Sounds fine by me. I may actually start tackling some of the feature/annotation overloading stuff myself to see what happens (I'll drop a notice when that occurs). chris
From tanzeem.mb at gmail.com Thu Jun 14 02:27:19 2007 From: tanzeem.mb at gmail.com (tanzeem) Date: Wed, 13 Jun 2007 23:27:19 -0700 (PDT) Subject: [Bioperl-l] Problem working with remoteblast submit method in webbrowser.
Message-ID: <11114623.post@talk.nabble.com> I have a program that uses the BioPerl RemoteBlast module to compare an amino acid FASTA file against the Swiss-Prot database. The submit_blast() method works successfully when run from the command line, but when the program is run from a web browser it returns -1. I was trying to adapt the code from the RemoteBlast synopsis for my needs. -- View this message in context: http://www.nabble.com/Problem-working-with-remoteblast-submit-method-in-webbrowser.-tf3919886.html#a11114623 Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
From bix at sendu.me.uk Thu Jun 14 11:34:27 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 14 Jun 2007 16:34:27 +0100 Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new() In-Reply-To: <46710BC4.3060302@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> Message-ID: <46716003.2030302@sendu.me.uk> Sendu Bala wrote: > It is preferable to have ->new syntax over new Object syntax, as > outlined here: > http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules > > I propose making this syntax change in all Bioperl POD documentation, so > that the bad syntax is no longer suggested/encouraged. Any objections? > If not, I'll go ahead and commit the changes. > > (affects 907 modules in live) It was actually 515 modules & test scripts from live, 48 from run, 21 from db and 2 from network. Now committed. Before and after my changes these were failing:

Failed Test      Stat Wstat Total Fail  List of Failed
-------------------------------------------------------------------------------
t/BioGraphics.t     3   768    38    3  3-5
t/PodSyntax.t       9  2304  2195    9  378 614 660 1023 1197 1512 1558 1932 2106
t/Sopma.t           2   512    16    2  8 15
t/genbank.t         2   512   247    2  122-123

BioGraphics failure caused by Graphics Panel.pm 1.135 -> 1.136 (unintentional?). Sopma may not be a bug: results from server might have changed.
genbank caused by genbank.t 1.20 -> 1.21 and presumably genbank.pm 1.163 -> 1.164 not doing what the new tests expect. PodSyntax is new and obviously a bunch of POD needs fixing: Nathan, are you working on that, or can I fix those errors? Anyone care to look into those things? Cheers, Sendu.
From cjfields at uiuc.edu Thu Jun 14 12:35:21 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 14 Jun 2007 11:35:21 -0500 Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new() In-Reply-To: <46716003.2030302@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> Message-ID: The genbank commit was mine so I'll look into it; it may be that I hadn't finished up the bug work. If I have time I'll look into Sopma as well (unless you get to it first). chris Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign
From n.haigh at sheffield.ac.uk Thu Jun 14 12:43:43 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Thu, 14 Jun 2007 17:43:43 +0100 Subject: [Bioperl-l] Perltidy In-Reply-To: <46716003.2030302@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> Message-ID: <4671703F.4010109@sheffield.ac.uk> I'm just wondering if anyone passes their modules through perltidy in order for them to have the same look/feel? If so, do you have a .perltidyrc file? Also, is it worth running the Bioperl modules through it? Nath
From n.haigh at sheffield.ac.uk Thu Jun 14 12:36:37 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Thu, 14 Jun 2007 17:36:37 +0100 Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new() In-Reply-To: <46716003.2030302@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> Message-ID: <46716E95.3090604@sheffield.ac.uk> Sendu Bala wrote: > PodSyntax is new and obviously a bunch of POD needs fixing: Nathan, are > you working on that, or can I fix those errors? I can fix these - although I'm still trying to get my new Debian 4.0 system up-to-speed so it might take me a little while! RE the PodSyntax.t, I've got it silently skipping the tests if Test::Pod isn't installed. However, would it be better to have Test::Pod in t/lib so that it runs on the user's system during installation or leave it as is? Nath
From bix at sendu.me.uk Thu Jun 14 13:15:24 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 14 Jun 2007 18:15:24 +0100 Subject: [Bioperl-l] Perltidy In-Reply-To: <4671703F.4010109@sheffield.ac.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> Message-ID: <467177AC.8060104@sendu.me.uk> Nathan S.
Haigh wrote: > I'm just wondering if anyone passes their modules through perltidy in > order for them to have the same look/feel? If so, do you have a > .perltidyrc file? Also, is it worth running the Bioperl modules through it? I don't use it, but I was contemplating the same thing. Chris uses it from time to time and I think we have a similar taste in style. But we'd have to hammer something out that was agreeable to everyone.
From mmokrejs at ribosome.natur.cuni.cz Thu Jun 14 13:19:42 2007 From: mmokrejs at ribosome.natur.cuni.cz (Martin MOKREJŠ) Date: Thu, 14 Jun 2007 19:19:42 +0200 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? In-Reply-To: <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> References: <466938F6.7050903@ribosome.natur.cuni.cz> <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> Message-ID: <467178AE.5040905@ribosome.natur.cuni.cz> David Messina wrote: > Hi Martin, > > You're in luck -- the BioPerl core distribution includes two scripts > for doing just that: > > genbank2gff Somehow these scripts were not installed for me on Gentoo, but I have them in the cvs copy. ;-) Anyway, the one above is not for me: I do not need the GFF database, or better to say I have no intent to install that unknown thing; it seems like overkill for my case. I just want to render a plasmid map. > genbank2gff3 This one seems more promising, but with the current cvs checkout I still get...

$ perl /home/mmokrejs/proj/bioperl/bioperl-live/blib/script/bp_genbank2gff3.pl --in stdin --out stdout < ~/99.gb
# Input: stdin
Use of uninitialized value in split at /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/genbank.pm line 1335, line 7.
Use of uninitialized value in quotemeta at /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/genbank.pm line 1338, line 7.
Can't call method "binomial" on an undefined value at /home/mmokrejs/proj/bioperl/bioperl-live/blib/script/bp_genbank2gff3.pl line 675, line 125.
$
$ bp_seqconvert.pl --from genbank --to embl < ~/IRESite/gb/99.gb
Use of uninitialized value in split at /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/genbank.pm line 1335, line 7.
Use of uninitialized value in quotemeta at /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/genbank.pm line 1338, line 7.
ID   unknown; SV 1; circular; unassigned DNA; STD; UNC; 5391 BP.
XX
AC   unknown;
XX
XX
XX
CC   ApEinfo:methylated:0
...

Oh dear, I have just manually edited the files and still they are wrong? Oh no. :( > > Look in the scripts directory of the distro. > > Also, there is a *huge* amount of documentation and examples on the > BioPerl website. > > http://www.bioperl.org/wiki/HOWTOs You mean http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File ? ;-) > > Reading those, reading the FAQ, and searching the mailing list > archives are where I look first when I don't know how to do something > in BioPerl. > > > Dave > > -- > Dave Messina > Senior Analyst, Assembly Group > Genome Sequencing Center > Washington University > St. Louis, MO -- Dr. Martin Mokrejs Dept. of Genetics and Microbiology Faculty of Science, Charles University Vinicna 5, 128 43 Prague, Czech Republic http://www.iresite.org http://www.iresite.org/~mmokrejs -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: 99.gb Url: http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070614/fc6e601a/attachment.pl
From mmokrejs at ribosome.natur.cuni.cz Thu Jun 14 13:23:28 2007 From: mmokrejs at ribosome.natur.cuni.cz (Martin MOKREJŠ) Date: Thu, 14 Jun 2007 19:23:28 +0200 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file?
In-Reply-To: <467178AE.5040905@ribosome.natur.cuni.cz> References: <466938F6.7050903@ribosome.natur.cuni.cz> <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> <467178AE.5040905@ribosome.natur.cuni.cz> Message-ID: <46717990.6040509@ribosome.natur.cuni.cz> Martin MOKREJŠ wrote: >> Also, there is a *huge* amount of documentation and examples on the >> BioPerl website. >> >> http://www.bioperl.org/wiki/HOWTOs > > You mean > http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File > ? ;-)

$ perl embl2picture.pl ~/99.gb | display -
Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, line 125.

Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa57f0), feature Bio::Location::Simple=HASH(0x893ebb8): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, line 125.

Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5880), feature Bio::Location::Simple=HASH(0x893ebac): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, line 125.

Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5430), feature Bio::Location::Simple=HASH(0x893e720): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, line 125.

Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa546c), feature Bio::Location::Simple=HASH(0x893e6b4): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, line 125.
$ The plasmid is a circular DNA, why is the diagram in linear? ;-) Martin From bix at sendu.me.uk Thu Jun 14 13:03:34 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 14 Jun 2007 18:03:34 +0100 Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new() In-Reply-To: <46716E95.3090604@sheffield.ac.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <46716E95.3090604@sheffield.ac.uk> Message-ID: <467174E6.1090001@sendu.me.uk> Nathan S. Haigh wrote: >> PodSyntax is new and obviously a bunch of POD needs fixing: Nathan, are >> you working on that, or can I fix those errors? > > I can fix these - although I'm still trying to get my new Debian 4.0 > system up-to-speed so it might take me a little while! RE the > PodSyntax.t, I've got it silently skipping the tests if Test::Pod isn't > installed. However, would it be better to have Test::Pod in t/lib so > that it runs on the user's system during installation or leave it as is? Leave it as is. Every-day users don't need to check the syntax of the pod. In fact, it really only needs to be done once, prior to packaging up a new release. From n.haigh at sheffield.ac.uk Thu Jun 14 13:32:37 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Thu, 14 Jun 2007 18:32:37 +0100 Subject: [Bioperl-l] Perltidy In-Reply-To: <467177AC.8060104@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> Message-ID: <46717BB5.8000706@sheffield.ac.uk> Sendu Bala wrote: > Nathan S. Haigh wrote: >> I'm just wondering if anyone passes their modules through perltidy in >> order for them to have the same look/feel? If so, do you have a >> .perltidyrc file? Also, is it worth running the Bioperl modules >> through it? > > I don't use it, but I was contemplating the same thing. Chris uses it > from time to time and I think we have a similar taste in style. 
> > But we'd have to hammer something out that was agreeable to everyone. A starting place may be Perl Best Practices by Damian Conway: http://www.oreilly.com/catalog/perlbp/ The perltidyrc file can be found here: http://www.perlmonks.org/?node_id=485885 I also found this nice thread with some ideas, inc some code that causes emacs to auto-perltidy everything you use cperl-mode with. I don't use emacs myself, but here's the link if anyone is interested: http://www.perlmonks.org/?node_id=516501 Nath
From johnsonm at gmail.com Thu Jun 14 13:38:31 2007 From: johnsonm at gmail.com (Mark Johnson) Date: Thu, 14 Jun 2007 12:38:31 -0500 Subject: [Bioperl-l] Perltidy In-Reply-To: <467177AC.8060104@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> Message-ID: The nice thing about Perl Tidy is that everybody can have their own config file. There could be a bioperl default config that gets applied at checkin time. Anybody that didn't like it could script checkouts to get run through their own config. Diffs might get a little hairy, but as long as you tidy before diffing, it shouldn't be too bad. Speaking of which....coding style is controversial enough, but since that's already been opened, what about CVS vs Subversion? 8) Some of the scripting for this sort of thing might be easier in Subversion. Though maybe something like Git would fit the developer model better (more support for distributed development).
From n.haigh at sheffield.ac.uk Thu Jun 14 13:39:39 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Thu, 14 Jun 2007 18:39:39 +0100 Subject: [Bioperl-l] cvs changes in working copy Message-ID: <46717D5B.5040108@sheffield.ac.uk> Not sure if I'm being dense or if it's because I've been working with svn recently, but - how do I get a list of files that are different in my working copy compared to the repository? Cheers Nath
From cjfields at uiuc.edu Thu Jun 14 13:46:38 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 14 Jun 2007 12:46:38 -0500 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? In-Reply-To: <46717990.6040509@ribosome.natur.cuni.cz> References: <466938F6.7050903@ribosome.natur.cuni.cz> <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> <467178AE.5040905@ribosome.natur.cuni.cz> <46717990.6040509@ribosome.natur.cuni.cz> Message-ID: Is 99.gb supposed to be a GenBank file? And you're loading it into embl2picture (which I assume takes EMBL format files)? Without example code we can easily make the wrong assumptions (i.e. that this is user error and not a BioPerl problem). Also, I don't believe the feature plotting scripts plot circular chromosomes/plasmids. If you want this functionality you'll have to code it for yourself. chris Christopher Fields Postdoctoral Researcher Lab of Dr.
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From arareko at campus.iztacala.unam.mx Thu Jun 14 13:57:35 2007 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Thu, 14 Jun 2007 12:57:35 -0500 Subject: [Bioperl-l] Perltidy In-Reply-To: <46717BB5.8000706@sheffield.ac.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <46717BB5.8000706@sheffield.ac.uk> Message-ID: <4671818F.5040902@campus.iztacala.unam.mx> I think a consensus .perltidyrc could be placed in the source distribution. Mauricio. Nathan S. Haigh wrote: > Sendu Bala wrote: >> Nathan S. Haigh wrote: >>> I'm just wondering if anyone passes their modules through perltidy in >>> order for them to have the same look/feel? If so, do you have a >>> .perltidyrc file? Also, is it worth running the Bioperl modules >>> through it? >> I don't use it, but I was contemplating the same thing. Chris uses it >> from time to time and I think we have a similar taste in style. >> >> But we'd have to hammer something out that was agreeable to everyone. > > A starting place may be Perl Best Practices by Damian Conway: > http://www.oreilly.com/catalog/perlbp/ > > > The perltidyrc file can be found here: > http://www.perlmonks.org/?node_id=485885 > > I also found this nice thread with some ideas, including some code that causes > emacs to auto-perltidy everything you use cperl-mode with.
I don't use > emacs myself, but here's the link if anyone is interested: > http://www.perlmonks.org/?node_id=516501 > > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Genética Unidad de Morfofisiología y Función Facultad de Estudios Superiores Iztacala, UNAM From cjfields at uiuc.edu Thu Jun 14 14:32:41 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 14 Jun 2007 13:32:41 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> Message-ID: To chip in on this, I only use perltidy when I need to clean bioperl code up for debugging (particularly if blocks are hard to see) and just use the defaults. I agree it would be nice to have everything tidied up but it'll definitely need to be a consensus config file. About svn, I like the idea of eventually migrating to using it over CVS (I think BioPython and BioJava have plans to but I'm not sure) but I don't really know enough to say how feasible/difficult the migration path would be. Anyone know? chris On Jun 14, 2007, at 12:38 PM, Mark Johnson wrote: > The nice thing about Perl Tidy is that everybody can have their > own config file. There could be a bioperl default config that gets > applied at checkin time. Anybody that didn't like it could script > checkouts to get run through their own config. Diffs might get a > little hairy, but as long as you tidy before diffing, it shouldn't be > too bad. Speaking of which... coding style is controversial enough, > but since that's already been opened, what about CVS vs Subversion? 8) > Some of the scripting for this sort of thing might be easier in > Subversion.
Though maybe something like Git would fit the developer > model better (more support for distributed development). > > On 6/14/07, Sendu Bala wrote: >> Nathan S. Haigh wrote: >>> I'm just wondering if anyone passes their modules through >>> perltidy in >>> order for them to have the same look/feel? If so, do you have a >>> .perltidyrc file? Also, is it worth running the Bioperl modules >>> through it? >> >> I don't use it, but I was contemplating the same thing. Chris uses it >> from time to time and I think we have a similar taste in style. >> >> But we'd have to hammer something out that was agreeable to everyone. >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From johnsonm at gmail.com Thu Jun 14 14:46:24 2007 From: johnsonm at gmail.com (Mark Johnson) Date: Thu, 14 Jun 2007 13:46:24 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> Message-ID: If there was a default/standard/consensus bioperl perltidy config file, I would probably use it prior to checkin, on my own, so I could code in my schizophrenic style without worrying about starting any format wars. When I'm fixing or enhancing somebody else's code, I always try and adapt to whatever style they used, even if it grates on my nerves. I'd love to not have to worry about that with Bioperl. Of course, nobody will ever agree on a standard, so it's probably a moot point.
8) On 6/14/07, Chris Fields wrote: > To chip in on this, I only use perltidy when I need to clean bioperl > code up for debugging (particularly if blocks are hard to see) and > just use the defaults. I agree it would be nice to have everything > tidied up but it'll definitely need to be a consensus config file. > > About svn, I like the idea of eventually migrating to using it over > CVS (I think BioPython and BioJava have plans to but I'm not sure) > but I don't really know enough to say how feasible/difficult the > migration path would be. Anyone know? > > chris From jason at bioperl.org Thu Jun 14 15:00:09 2007 From: jason at bioperl.org (Jason Stajich) Date: Thu, 14 Jun 2007 12:00:09 -0700 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> Message-ID: On Jun 14, 2007, at 11:32 AM, Chris Fields wrote: > To chip in on this, I only use perltidy when I need to clean bioperl > code up for debugging (particularly if blocks are hard to see) and > just use the defaults. I agree it would be nice to have everything > tidied up but it'll definitely need to be a consensus config file. > Can we do any sort of massive conversion at some logical timepoint? Probably after a branch release or something? Because it basically means we're going to have differences on nearly every line, which is going to make diff-ing difficult when debugging old/new versions. Maybe it is not a problem because we aren't introducing any new bugs! > About svn, I like the idea of eventually migrating to using it over > CVS (I think BioPython and BioJava have plans to but I'm not sure) > but I don't really know enough to say how feasible/difficult the > migration path would be. Anyone know? > It's doable but non-trivial. A cvs2svn (python gah!) script exists to help in this. There are pros and cons to converting.
There is a fair amount of documentation and other pointers out there that point to the CVS server for getting latest code so we'd need to think about whether we'd support some sort of backwards compatible SVN -> CVS for read-only or what. Mostly it will need someone to lead the charge - I made a go at doing it in the winter, but I really don't have the SVN-foo to make this work. We'd need someone with SVN experience to step up and help. You can always try and we can play with the converted repository for a while without making it the new code base. -j > chris > > On Jun 14, 2007, at 12:38 PM, Mark Johnson wrote: > >> The nice thing about Perl Tidy is that everybody can have their >> own config file. There could be a bioperl default config that gets >> applied at checkin time. Anybody that didn't like it could script >> checkouts to get run through their own config. Diffs might get a >> little hairy, but as long as you tidy before diffing, it shouldn't be >> too bad. Speaking of which....coding style is controversial enough, >> but since that's already been opened, what about CVS vs >> Subversion? 8) >> Some of the scripting for this sort of thing might be easer in >> Subversion. Though maybe something like Git would fit the developer >> model better (more support for distributed development). >> >> On 6/14/07, Sendu Bala wrote: >>> Nathan S. Haigh wrote: >>>> I'm just wondering if anyone passes their modules through >>>> perltidy in >>>> order for them to have the same look/feel? If so, do you have a >>>> .perltidyrc file? Also, is it worth running the Bioperl modules >>>> through it? >>> >>> I don't use it, but I was contemplating the same thing. Chris >>> uses it >>> from time to time and I think we have a similar taste in style. >>> >>> But we'd have to hammer something out that was agreeable to >>> everyone. 
>>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From jason at bioperl.org Thu Jun 14 15:01:27 2007 From: jason at bioperl.org (Jason Stajich) Date: Thu, 14 Jun 2007 12:01:27 -0700 Subject: [Bioperl-l] cvs changes in working copy In-Reply-To: <46717D5B.5040108@sheffield.ac.uk> References: <46717D5B.5040108@sheffield.ac.uk> Message-ID: cvs update | grep '^M' On Jun 14, 2007, at 10:39 AM, Nathan S. Haigh wrote: > Not sure if I'm being dense or if it's because I've been working with > svn recently, but - how do I get a list of files that are different in > my working copy compared to the repository? 
> > Cheers > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From cjfields at uiuc.edu Thu Jun 14 15:20:46 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 14 Jun 2007 14:20:46 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> Message-ID: On Jun 14, 2007, at 2:00 PM, Jason Stajich wrote: > > On Jun 14, 2007, at 11:32 AM, Chris Fields wrote: > >> To chip in on this, I only use perltidy when I need to clean bioperl >> code up for debugging (particularly if blocks are hard to see) and >> just use the defaults. I agree it would be nice to have everything >> tidied up but it'll definitely need to be a consensus config file. >> > > Can we do any sort of massive conversion at some logical timepoint. > Probably after a branch release or something? Because it basically > means we're going to have differences on nearly every line which is > going to make diff-ing difficult when debugging old/new versions. > Maybe it is not a problem because we aren't introducing and new bugs! I agree; if we intend on doing this it should be all at once, maybe on a branch dedicated to ensure that code changes don't tank tests (they shouldn't but one never knows). We would then need a script up- and-running that tidies everything up prior to commits (though what happens if perltidy tanks?...). Sendu, up for it? >> About svn, I like the idea of eventually migrating to using it over >> CVS (I think BioPython and BioJava have plans to but I'm not sure) >> but I don't really know enough to say how feasible/difficult the >> migration path would be. Anyone know? >> > > It's doable but non-trivial. cvs2svn (python gah!) script exists to > help in this. 
There are pros and cons to converting. There is a > fair amount of documentation and other pointers out there that point > to the CVS server for getting latest code so we'd need to think about > whether we'd support some sort of backwards compatible SVN -> CVS for > read-only or what. > > Mostly it will need someone to lead the charge - I made a go at doing > it in the winter, but I really don't have the SVN-foo to make this > work. We'd need someone with SVN experience to step up and help. > You can always try and we can play with the converted repository for > a while without making it the new code base. > > -j Stepped into that one, didn't I! I'll look into how much effort is involved and try getting something going in the next month or two, maybe sooner if time permits. I'm lacking on SVN-foo as well but it might be worth looking into. chris From arareko at campus.iztacala.unam.mx Thu Jun 14 15:50:39 2007 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Thu, 14 Jun 2007 14:50:39 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> Message-ID: <46719C0F.5010706@campus.iztacala.unam.mx> Chris Fields wrote: > On Jun 14, 2007, at 2:00 PM, Jason Stajich wrote: > >> On Jun 14, 2007, at 11:32 AM, Chris Fields wrote: >> >>> About svn, I like the idea of eventually migrating to using it over >>> CVS (I think BioPython and BioJava have plans to but I'm not sure) >>> but I don't really know enough to say how feasible/difficult the >>> migration path would be. Anyone know? >>> >> It's doable but non-trivial. cvs2svn (python gah!) script exists to >> help in this. There are pros and cons to converting. 
There is a >> fair amount of documentation and other pointers out there that point >> to the CVS server for getting latest code so we'd need to think about >> whether we'd support some sort of backwards compatible SVN -> CVS for >> read-only or what. >> >> Mostly it will need someone to lead the charge - I made a go at doing >> it in the winter, but I really don't have the SVN-foo to make this >> work. We'd need someone with SVN experience to step up and help. >> You can always try and we can play with the converted repository for >> a while without making it the new code base. >> >> -j > > Stepped into that one, didn't I! I'll look into how much effort is > involved and try getting something going in the next month or two, > maybe sooner if time permits. I'm lacking on SVN-foo as well but it > might be worth looking into. > > chris > Chris D has worked with CVS-SVN transitioning for other projects; maybe he can shed some light on this. Mauricio. -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Genética Unidad de Morfofisiología y Función Facultad de Estudios Superiores Iztacala, UNAM From sac at bioperl.org Thu Jun 14 17:33:39 2007 From: sac at bioperl.org (Steve Chervitz) Date: Thu, 14 Jun 2007 14:33:39 -0700 Subject: [Bioperl-l] How can I pull out all instances of a motif from a genome sequence and output them as a BED file? In-Reply-To: <5F3F49B4-CE07-4DD7-B82E-9DDE42B516D0@uiuc.edu> References: <5F3F49B4-CE07-4DD7-B82E-9DDE42B516D0@uiuc.edu> Message-ID: <8f200b4c0706141433i37267774u1dc2193d8508c47b@mail.gmail.com> This issue was discussed recently here. Check out this thread: http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15046/focus=15048 Some of the tools listed in the FAQ item Chris mentioned do not report where the match occurred, only that a match occurred (String::Approx, agrep), though some do report match locations (fuzznuc, fuzzprot; not sure about TFBS).
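A note on reporting match locations with a plain Perl regex, offered as a minimal sketch rather than anyone's posted solution: with the /g modifier in a while loop, $-[0] gives the 0-based offset where each match starts, which is what BED uses for chromStart. The toy sequence, the chromosome label "chr3L" and the helper names motif_hits and bed_lines are illustrative assumptions; in real use the string would more likely come from a sequence object's seq() method. The lookahead is used so that overlapping hits are reported too. It is also worth knowing that `.` and `+` have the same precedence in Perl, so an expression like "text " . pos($s)+1 is evaluated as a number unless the addition is parenthesised.

```perl
use strict;
use warnings;

# Return [start, end] pairs for every occurrence of a literal motif.
# Coordinates are 0-based with an exclusive end, as BED expects.
sub motif_hits {
    my ($seq, $motif) = @_;
    my @hits;
    # Zero-width lookahead: the engine advances one base at a time,
    # so overlapping occurrences are found as well.
    while ($seq =~ /(?=\Q$motif\E)/gi) {
        my $start = $-[0];    # offset where this match begins
        push @hits, [ $start, $start + length($motif) ];
    }
    return @hits;
}

# Format hits as tab-separated BED lines (chrom, start, end, name).
sub bed_lines {
    my ($chrom, $name, @hits) = @_;
    return map { join("\t", $chrom, $_->[0], $_->[1], $name) . "\n" } @hits;
}

# Hypothetical toy input; "chr3L" is only a placeholder label.
my $seq  = 'GAAAATTAAAC';
my @hits = motif_hits($seq, 'AAA');
print bed_lines('chr3L', 'AAA', @hits);

# 1-based positions for human-readable output; note the parentheses
# around the addition.
print "Found AAA at base " . ($_->[0] + 1) . "\n" for @hits;
```

For the toy sequence this yields the intervals 1-4, 2-5 and 7-10. A motif containing ambiguity codes would first need converting to a regex, and the \Q...\E quoting dropped; that part is not covered here.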
My Bio::Tools::SeqPattern module does not even perform any matches; it just encapsulates a regular expression for a nuc or protein motif and knows how to handle ambiguity code expansion and reverse complementing. The idea is that you can use this to convert a biological sequence motif into a string suitable for use in a perl regex. Adding a match() method to this module would be handy. There is an example script for it in examples/tools of the distro (which, btw, references an obsolete module, so it won't run as is -- I'll fix). Steve On 6/13/07, Chris Fields wrote: > This is answered in the FAQ (sorry if the URL wraps, but we don't > like tinyurls): > > http://www.bioperl.org/wiki/FAQ#How_do_I_do_motif_searches_with_BioPerl.3F_Can_I_do_.22find_all_sequences_that_are_75.25_identical.22_to_a_given_motif.3F > > chris > > On Jun 13, 2007, at 7:20 PM, John Cumbers wrote: > > > Hello, > > > > I have a simple problem, I'm trying to search a genome sequence for > > a motif, > > I then want to output a BED file to display all the locations of > > this motif > > on the UCSC Genome Browser. I could not find a script to do this, > > so I > > started to write my own. I'm new to perl and my code below was my > > attempt > > to read the sequence string and output the index bp of the start of > > each > > motif. With this I could build the BED file myself, which requires > > start > > and finish base pairs. > > > > For the first motif I can output the start index, but when I try > > and read > > the next one off the sequence it does not work. Instead I just get an > > output of a list of 1's. I realise that this is more a request for > > some > > simple perl help, but any help much appreciated. > > > > Best wishes, > > John > > > > > > $seq_object = read_sequence("Drosophila.Chr3.test.AE014296.fasta"); # turn my FASTA file into a seq object.
> > $sequence_as_a_string = $seq_object->seq(); #turn it into a string > > # search $sequence_as_a_string string for motif AAA as example > > # if found, return the index that it is found at > > > > while ($sequence_as_a_string =~ m/AAA/g) { > > print "Found '$&'. Next attempt at character " . > > pos($sequence_as_a_string)+1 . "\n"; > > } > > > > > > > > -- > > John Cumbers, Graduate Student > > Biology and Medicine > > Brown University, Box G-W > > Providence, Rhode Island, 02912, USA > > Tel USA: +1 401 523 8190, Fax: +1 401 863-2166 > > UK to USA: 0207 617 7824 > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From hlapp at gmx.net Thu Jun 14 19:04:11 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 14 Jun 2007 19:04:11 -0400 Subject: [Bioperl-l] cvs changes in working copy In-Reply-To: References: <46717D5B.5040108@sheffield.ac.uk> Message-ID: <3B262E6A-2C90-49FA-BCA1-BF1900C5AC3A@gmx.net> Actually, that will update your working copy. If you just wanted to take a peek you would use cvs status: $ cvs status | grep 'Locally Modified' -hilmar On Jun 14, 2007, at 3:01 PM, Jason Stajich wrote: > cvs update | grep '^M' > > On Jun 14, 2007, at 10:39 AM, Nathan S. Haigh wrote: > >> Not sure if I'm being dense or if it's because I've been working with >> svn recently, but - how do I get a list of files that are >> different in >> my working copy compared to the repository?
>> >> Cheers >> Nath >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > Jason Stajich > jason at bioperl.org > http://jason.open-bio.org/ > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From mmokrejs at ribosome.natur.cuni.cz Fri Jun 15 03:28:17 2007 From: mmokrejs at ribosome.natur.cuni.cz (Martin MOKREJŠ) Date: Fri, 15 Jun 2007 09:28:17 +0200 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? In-Reply-To: References: <466938F6.7050903@ribosome.natur.cuni.cz> <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> <467178AE.5040905@ribosome.natur.cuni.cz> <46717990.6040509@ribosome.natur.cuni.cz> Message-ID: <46723F91.60501@ribosome.natur.cuni.cz> Chris Fields wrote: > Is 99.gb supposed to be a GenBank file? And you're loading it into Yes, it was attached to the email. ;) > embl2picture (which I assume takes EMBL format files)? Without example > code we can easily make the wrong assumptions (i.e. that this is user > error and not a BioPerl problem). use constant USAGE =><<END; Usage: $0 <file> Render a GenBank/EMBL entry into drawable form. Return as a GIF or PNG image on standard output. File must be in embl, genbank, or another SeqIO- recognized format. Only the first entry will be rendered. Example to try: embl2picture.pl factor7.embl | display - END > > Also, I don't believe the feature plotting scripts plot circular > chromosomes/plasmids. If you want this functionality you'll have to > code it for yourself. That's a pity it does not, but at least if someone could improve the docs.
;-) Unfortunately I don't have the time to rewrite the code myself now, I need a working, standalone, already available tool. :( M. > > chris > > On Jun 14, 2007, at 12:23 PM, Martin MOKREJŠ wrote: > >> Martin MOKREJŠ wrote: >> >>>> Also, there is a *huge* amount of documentation and examples on the >>>> BioPerl website. >>>> >>>> http://www.bioperl.org/wiki/HOWTOs >>> >>> You mean >>> http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File >>> >>> ? ;-) >> >> $ perl embl2picture.pl ~/99.gb | display - >> Error returned while evaluating value of 'description' option for >> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature >> Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method >> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line >> 141, line 125. >> >> Error returned while evaluating value of 'description' option for >> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa57f0), feature >> Bio::Location::Simple=HASH(0x893ebb8): Can't locate object method >> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line >> 141, line 125. >> >> Error returned while evaluating value of 'description' option for >> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5880), feature >> Bio::Location::Simple=HASH(0x893ebac): Can't locate object method >> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line >> 141, line 125. >> >> Error returned while evaluating value of 'description' option for >> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5430), feature >> Bio::Location::Simple=HASH(0x893e720): Can't locate object method >> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line >> 141, line 125.
>> >> Error returned while evaluating value of 'description' option for >> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa546c), feature >> Bio::Location::Simple=HASH(0x893e6b4): Can't locate object method >> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line >> 141, line 125. >> $ >> >> The plasmid is a circular DNA, why is the diagram in linear? ;-) >> >> Martin >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > > -- Dr. Martin Mokrejs Dept. of Genetics and Microbiology Faculty of Science, Charles University Vinicna 5, 128 43 Prague, Czech Republic http://www.iresite.org http://www.iresite.org/~mmokrejs From dhoworth at mrc-lmb.cam.ac.uk Fri Jun 15 04:59:09 2007 From: dhoworth at mrc-lmb.cam.ac.uk (Dave Howorth) Date: Fri, 15 Jun 2007 09:59:09 +0100 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? In-Reply-To: <46717990.6040509@ribosome.natur.cuni.cz> References: <466938F6.7050903@ribosome.natur.cuni.cz> <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> <467178AE.5040905@ribosome.natur.cuni.cz> <46717990.6040509@ribosome.natur.cuni.cz> Message-ID: <467254DD.3010505@mrc-lmb.cam.ac.uk> Martin MOKREJŠ wrote: >>> Also, there is a *huge* amount of documentation and examples on >>> the BioPerl website. >>> >>> http://www.bioperl.org/wiki/HOWTOs >> You mean >> http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File >> ?
;-) > > $ perl embl2picture.pl ~/99.gb | display - Error returned while > evaluating value of 'description' option for glyph > Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature > Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method > "all_tags" via package "Bio::Location::Simple" at embl2picture.pl > line 141, line 125. Hmm an error at line 141 of a 69 line script? Methinks you're not actually running the script that's presented on the wiki page you quoted. I cut-and-pasted the script and your file and it worked for me (at least, it produced an image, along with a bunch of OOPS lines) HTH, Dave From n.haigh at sheffield.ac.uk Fri Jun 15 06:21:38 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 15 Jun 2007 11:21:38 +0100 Subject: [Bioperl-l] Installation using --install_base Message-ID: <46726832.7080601@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 I'm setting up a new installation of Debian 4.0 at home and though I'd try to install BioPerl as a normal user rather than root. So in CPAN options I set the --install_base to /home/username/perl and set PERL5LIB to point to the same place. Now, when I run "perl Build.PL" on the HEAD of BioPerl as a non root user and ask to install all optional modules, it tries to install them through CPAN - however it seems to fail because some dependencies don't seem to want to install in a user directory. Has anyone else found this or might I be doing something wrong? 
Nath -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGcmgyczuW2jkwy2gRAtgqAKDIv717ciVHr5V+Z1kqPV2a++E8dgCfYr2a VPt4tEPLW2J+BiKnN3B8aV8= =c+9z -----END PGP SIGNATURE----- From bix at sendu.me.uk Fri Jun 15 06:07:04 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 15 Jun 2007 11:07:04 +0100 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> Message-ID: <467264C8.4020202@sendu.me.uk> Chris Fields wrote: > On Jun 14, 2007, at 2:00 PM, Jason Stajich wrote: > >> On Jun 14, 2007, at 11:32 AM, Chris Fields wrote: >> >>> To chip in on this, I only use perltidy when I need to clean bioperl >>> code up for debugging (particularly if blocks are hard to see) and >>> just use the defaults. I agree it would be nice to have everything >>> tidied up but it'll definitely need to be a consensus config file. >>> >> Can we do any sort of massive conversion at some logical timepoint. >> Probably after a branch release or something? Because it basically >> means we're going to have differences on nearly every line which is >> going to make diff-ing difficult when debugging old/new versions. >> Maybe it is not a problem because we aren't introducing and new bugs! Sorry, can you clarify the problem you envisage? And why would making a branch release help? > I agree; if we intend on doing this it should be all at once, maybe > on a branch dedicated to ensure that code changes don't tank tests > (they shouldn't but one never knows). We would then need a script up- > and-running that tidies everything up prior to commits (though what > happens if perltidy tanks?...). > > Sendu, up for it? If its going to be difficult and a hassle, for such an unnecessary thing I'm not sure its worth it. There are more pressing things to be done for Bioperl. 
If I can just run perltidy on the entire package and commit, I'd do it. If that's not appropriate, I won't. >>> About svn [snip] > Stepped into that one, didn't I! I'll look into how much effort is > involved and try getting something going in the next month or two, > maybe sooner if time permits. I'm lacking on SVN-foo as well but it > might be worth looking into. I'd put this in the unnecessary-but-nice category as well. If it will be as easy as my ->new change, go ahead. If not, there are more pressing matters (POD fixing, test script updating and finishing...). From n.haigh at sheffield.ac.uk Fri Jun 15 06:35:40 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 15 Jun 2007 11:35:40 +0100 Subject: [Bioperl-l] Installation using --install_base Message-ID: <46726B7C.7070902@sheffield.ac.uk> I'm setting up a new installation of Debian 4.0 at home and though I'd try to install BioPerl as a normal user rather than root. So in CPAN options I set the --install_base to /home/username/perl and set PERL5LIB to point to the same place. Now, when I run "perl Build.PL" on the HEAD of BioPerl as a non root user and ask to install all optional modules, it tries to install them through CPAN - however it seems to fail because some dependencies don't seem to want to install in a user directory. Has anyone else found this or might I be doing something wrong? Nath From bix at sendu.me.uk Fri Jun 15 06:45:48 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 15 Jun 2007 11:45:48 +0100 Subject: [Bioperl-l] Installation using --install_base In-Reply-To: <46726832.7080601@sheffield.ac.uk> References: <46726832.7080601@sheffield.ac.uk> Message-ID: <46726DDC.8090202@sendu.me.uk> Nathan S. Haigh wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > I'm setting up a new installation of Debian 4.0 at home and though I'd > try to install BioPerl as a normal user rather than root. 
So in CPAN > options I set the --install_base to /home/username/perl and set PERL5LIB > to point to the same place. > > Now, when I run "perl Build.PL" on the HEAD of BioPerl as a non root > user and ask to install all optional modules, it tries to install them > through CPAN - however it seems to fail because some dependencies don't > seem to want to install in a user directory. > > Has anyone else found this or might I be doing something wrong? You'll need to configure CPAN to install into your user directory. Upgrade to the latest version, then go read the docs on the various configurable options. I thought I at least mentioned this in the Bioperl INSTALL doc. If not, can someone come up with a concise clarification? From sdavis2 at mail.nih.gov Fri Jun 15 06:56:08 2007 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Fri, 15 Jun 2007 06:56:08 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <467264C8.4020202@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> Message-ID: <46727048.3080904@mail.nih.gov> Sendu Bala wrote: > If its going to be difficult and a hassle, for such an unnecessary thing > I'm not sure its worth it. There are more pressing things to be done for > Bioperl. > > If I can just run perltidy on the entire package and commit, I'd do it. > If that's not appropriate, I won't. I agree with the sentiment noted above. I'm a bit of an outsider here, but bioperl is a collaborative project. Not everyone has the same sentiments about what "correct" style means. As a programmer, I really wouldn't want significant changes on the style of my code. And perl happily puts up with many styles. I would say leave things as they are--let the individual programmers choose. It reduces the amount of work of questionable importance and allows the coding style freedom that perl supports. Just my $.02. 
Sean From cjfields at uiuc.edu Fri Jun 15 10:05:07 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 15 Jun 2007 09:05:07 -0500 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? In-Reply-To: <46723F91.60501@ribosome.natur.cuni.cz> References: <466938F6.7050903@ribosome.natur.cuni.cz> <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> <467178AE.5040905@ribosome.natur.cuni.cz> <46717990.6040509@ribosome.natur.cuni.cz> <46723F91.60501@ribosome.natur.cuni.cz> Message-ID: On Jun 15, 2007, at 2:28 AM, Martin MOKREJŠ wrote: > Chris Fields wrote: >> Is 99.gb supposed to be a GenBank file? And you're loading it into > > Yes, it was attached to the email. ;) Sorry about that. I notice that '.' was added, but the spacing seemed off. I think bioperl catches that fine but it's something Wayne should consider. >> embl2picture (which I assume takes EMBL format files)? Without >> example >> code we can easily make the wrong assumptions (i.e. that this is user >> error and not a BioPerl problem). > > use constant USAGE =><<END; > Usage: $0 <file> > Render a GenBank/EMBL entry into drawable form. > Return as a GIF or PNG image on standard output. > > File must be in embl, genbank, or another SeqIO- > recognized format. Only the first entry will be > rendered. > > Example to try: > embl2picture.pl factor7.embl | display - > > END Horribly named script (should be seq2picture, since it converts both gb/embl). The use of 'all_tags' makes me think the script version you are using is old, as those methods have long since been renamed. Dave has it working though, so maybe your version has been updated? The 'use of uninitialized value in' errors are probably from inclusion of mandatory fields with no data or '.'. >> Also, I don't believe the feature plotting scripts plot circular >> chromosomes/plasmids. If you want this functionality you'll have to >> code it for yourself. > > That's a pity it does not, but at least someone could improve > the docs.
;) > Unfortunately I don't have the time to rewrite the code myself now, > I need a working, standalone, already available tool. :( > M. As I said, unless someone shows interest and codes it just won't get done. We have had very little interest in this, either b/c there are tools already out there to do this very thing (multitudes of plasmid drawing programs, some free like ApE) or that nobody's bothered to write it up. chris From cjfields at uiuc.edu Fri Jun 15 10:22:23 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 15 Jun 2007 09:22:23 -0500 Subject: [Bioperl-l] Perltidy and... SVN and ...Re: Perltidy In-Reply-To: <46727048.3080904@mail.nih.gov> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> <46727048.3080904@mail.nih.gov> Message-ID: <78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu> On Jun 15, 2007, at 5:56 AM, Sean Davis wrote: > Sendu Bala wrote: >> If its going to be difficult and a hassle, for such an unnecessary >> thing >> I'm not sure its worth it. There are more pressing things to be >> done for >> Bioperl. >> >> If I can just run perltidy on the entire package and commit, I'd >> do it. >> If that's not appropriate, I won't. > > I agree with the sentiment noted above. I'm a bit of an outsider > here, > but bioperl is a collaborative project. Not everyone has the same > sentiments about what "correct" style means. As a programmer, I > really > wouldn't want significant changes on the style of my code. And perl > happily puts up with many styles. I would say leave things as they > are--let the individual programmers choose. It reduces the amount of > work of questionable importance and allows the coding style freedom > that > perl supports. > > Just my $.02. > > Sean I tend to run it on modules that need some reformatting (SearchIO::blast comes to mind). 
I believe you're correct when this comes down to programming style, but I think this echoes a sentiment (frustration, perhaps) that some of us have with long-term maintenance of said code. Maybe a compromise: include a copy of .perltidyrc with the distribution that goes by what a consensus wants or by the general rules laid out in Perl Best Practices (spaced settings, use of spaces over tabs, etc). Conversion would be encouraged but voluntary, with the caveat that if someone needs to clean up code down the road (bug fixes, enhancements, etc) and if the original author isn't able to add it in themselves, it could be perltidy'd in order to help the developer (locate and fix the issue)|(add relevant enhancement where needed). chris From cjfields at uiuc.edu Fri Jun 15 10:56:23 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 15 Jun 2007 09:56:23 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <467264C8.4020202@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> Message-ID: <3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote: >>>> ... >>> Can we do any sort of massive conversion at some logical timepoint. >>> Probably after a branch release or something? Because it basically >>> means we're going to have differences on nearly every line which is >>> going to make diff-ing difficult when debugging old/new versions. >>> Maybe it is not a problem because we aren't introducing and new >>> bugs! > > Sorry, can you clarify the problem you envisage? And why would > making a branch release help? Maybe the worry is that mass conversion in such a large codebase could lead to hard-to-locate bugs. Shouldn't occur but who knows w/o trying? 
>> I agree; if we intend on doing this it should be all at once, >> maybe on a branch dedicated to ensure that code changes don't >> tank tests (they shouldn't but one never knows). We would then >> need a script up-and-running that tidies everything up prior to >> commits (though what happens if perltidy tanks?...). >> Sendu, up for it? > > If it's going to be difficult and a hassle, for such an unnecessary > thing I'm not sure it's worth it. There are more pressing things to > be done for Bioperl. > > If I can just run perltidy on the entire package and commit, I'd do > it. If that's not appropriate, I won't. The choices aren't necessarily all or nothing. What about voluntary, recommended use of a perltidy config file included with the distribution, with additional 'caveats'? See my response to Sean. >>>> About svn > [snip] >> Stepped into that one, didn't I! I'll look into how much effort >> is involved and try getting something going in the next month or >> two, maybe sooner if time permits. I'm lacking on SVN-foo as >> well but it might be worth looking into. > > I'd put this in the unnecessary-but-nice category as well. If it > will be as easy as my ->new change, go ahead. If not, there are > more pressing matters (POD fixing, test script updating and > finishing...). A few other open-bio projects have actively discussed a CVS->SVN migration (BioRuby and I think BioPython, though I could be wrong about the latter). As I said, "it might be worth looking into" to weigh the pros/cons, get opinions from others who have made the transition, etc. We could, as Jason suggested, even set up a tester SVN w/o making it the default codebase (lock it off to a few testers, have CVS commits automatically/manually carry over to SVN, etc). I agree with you that it's not feasible to switch over prior to a release and that there are more pressing issues, but it doesn't hurt having an open discussion about it.
chris From sdavis2 at mail.nih.gov Fri Jun 15 11:15:57 2007 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Fri, 15 Jun 2007 11:15:57 -0400 Subject: [Bioperl-l] Perltidy and... SVN and ...Re: Perltidy In-Reply-To: <78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> <46727048.3080904@mail.nih.gov> <78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu> Message-ID: <4672AD2D.2090001@mail.nih.gov> Chris Fields wrote: > > On Jun 15, 2007, at 5:56 AM, Sean Davis wrote: > >> Sendu Bala wrote: >>> If its going to be difficult and a hassle, for such an unnecessary thing >>> I'm not sure its worth it. There are more pressing things to be done for >>> Bioperl. >>> >>> If I can just run perltidy on the entire package and commit, I'd do it. >>> If that's not appropriate, I won't. >> >> I agree with the sentiment noted above. I'm a bit of an outsider here, >> but bioperl is a collaborative project. Not everyone has the same >> sentiments about what "correct" style means. As a programmer, I really >> wouldn't want significant changes on the style of my code. And perl >> happily puts up with many styles. I would say leave things as they >> are--let the individual programmers choose. It reduces the amount of >> work of questionable importance and allows the coding style freedom that >> perl supports. >> >> Just my $.02. >> >> Sean > > I tend to run it on modules that need some reformatting (SearchIO::blast > comes to mind). I believe you're correct when this comes down to > programming style, but I think this echoes a sentiment (frustration, > perhaps) that some of us have with long-term maintenance of said code. 
> > Maybe a compromise: include a copy of .perltidyrc with the distribution > that goes by what a consensus wants or by the general rules laid out in > Perl Best Practices (spaced settings, use of spaces over tabs, etc). > Conversion would be encouraged but voluntary, with the caveat that if > someone needs to clean up code down the road (bug fixes, enhancements, > etc) and if the original author isn't able to add it in themselves, it > could be perltidy'd in order to help the developer (locate and fix the > issue)|(add relevant enhancement where needed). Don't get me wrong--I think whatever makes bioperl a better, more maintainable beast should be what is done. The bioperl gurus should absolutely do what is best for them for code maintainability. Sean From n.haigh at sheffield.ac.uk Fri Jun 15 11:17:15 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 15 Jun 2007 16:17:15 +0100 Subject: [Bioperl-l] Perltidy and... SVN and ...Re: Perltidy In-Reply-To: <78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> <46727048.3080904@mail.nih.gov> <78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu> Message-ID: <4672AD7B.4050109@sheffield.ac.uk> Chris Fields wrote: > On Jun 15, 2007, at 5:56 AM, Sean Davis wrote: > >> Sendu Bala wrote: >>> If its going to be difficult and a hassle, for such an unnecessary >>> thing >>> I'm not sure its worth it. There are more pressing things to be >>> done for >>> Bioperl. >>> >>> If I can just run perltidy on the entire package and commit, I'd >>> do it. >>> If that's not appropriate, I won't. >> I agree with the sentiment noted above. I'm a bit of an outsider >> here, >> but bioperl is a collaborative project. Not everyone has the same >> sentiments about what "correct" style means. 
As a programmer, I >> really >> wouldn't want significant changes on the style of my code. And perl >> happily puts up with many styles. I would say leave things as they >> are--let the individual programmers choose. It reduces the amount of >> work of questionable importance and allows the coding style freedom >> that >> perl supports. >> >> Just my $.02. >> >> Sean > > I tend to run it on modules that need some reformatting > (SearchIO::blast comes to mind). I believe you're correct when this > comes down to programming style, but I think this echoes a sentiment > (frustration, perhaps) that some of us have with long-term > maintenance of said code. > > Maybe a compromise: include a copy of .perltidyrc with the > distribution that goes by what a consensus wants or by the general > rules laid out in Perl Best Practices (spaced settings, use of spaces > over tabs, etc). RE spaces, tabs etc - how well are the different coding styles handled when displayed in HTML and via the online browsable CVS? > Conversion would be encouraged but voluntary, with > the caveat that if someone needs to clean up code down the road (bug > fixes, enhancements, etc) and if the original author isn't able to > add it in themselves, it could be perltidy'd in order to help the > developer (locate and fix the issue)|(add relevant enhancement where > needed). > > chris > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From johnsonm at gmail.com Fri Jun 15 15:37:26 2007 From: johnsonm at gmail.com (Mark Johnson) Date: Fri, 15 Jun 2007 14:37:26 -0500 Subject: [Bioperl-l] Why does Bio::DB::GFF::Feature::gff3_string swap start and stop coordinates?? In-Reply-To: References: <79FDA731-CC37-42B0-8200-0865F52C1CAC@uiuc.edu> <62034FE5-C375-49F3-9A4E-2545F93615F4@uiuc.edu> <9FAD90F3-79B3-4002-9A11-6C11F7D00614@uiuc.edu> Message-ID: Patches waiting in Bugzilla (Bug #2299).
Changes: -Bio::Tools::Glimmer now emits Bio::SeqFeature::Generic features for prokaryotic reports (Glimmer2/Glimmer3) -Bio::Tools::Glimmer now produces features with Fuzzy or Split locations as appropriate (partial or circular/wraparound predictions) -Bio::Tools::Glimmer goes through the Glimmer3 .detail file to pull out sequence lengths -Bio::Tools::Run::Glimmer passes along the sequence length to Bio::Tools::Glimmer for Glimmer2 I should probably modify Bio::Tools::Genemark to use Bio::SeqFeature::Generic features for prokaryotic reports, to be consistent, but this is more likely to surprise people. If nobody screams about the change to Bio::Tools::Glimmer, I'll do it at some point. On 5/21/07, Chris Fields wrote: > > On May 21, 2007, at 7:29 PM, Torsten Seemann wrote: > > >> glimmer2/3 both assume the genome is circular by default (I'm > >> assuming since Glimmer2/3 are used for bacterial genomes). Acc. to > >> the Glimmer3 release notes the detail file has the information in the > >> header; from the Glimmer3 data used for tests: > > > > You beat me to the reply Chris - yes, Glimmer2/3 assume circular > > chromosome by default. I had forgotten about this in earlier > > discussions of the new Glimmer parsers as I normally run it in > > --linear / -L mode (even if I know it is circular) because it is > > easier to handle, and our sequencer/assembler team usually gets the > > origin of replication right. > > > >> Command: /bio/sw/glimmer3/bin/glimmer3 -o 50 -g 110 -t 30 ../BCTDNA > >> Glimmer3.icm Glimmer3 > > > > I did a double-take here - that's the path to my Glimmer3 > > installation! It took me a couple of minutes to realise that you got > > it from the bioperl test data I created. D'oh! :-) > > Yep, I forgot about that! > > >> There are options available for glimmer3 (-L, -X) that specify a > >> linear sequence or allow ORFs to extend past the end of the sequence > >> analyzed (the latter assumes a linear sequence).
> > > > If the -L mode should produce Bio::Location::Split objects, I guess if > > -X is used > > it should produce Bio::Location::Fuzzy objects too... > > > > --Torsten > > True, didn't think about that one. Def. something to consider adding > in. > > chris > > > From cjfields at uiuc.edu Fri Jun 15 16:55:06 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 15 Jun 2007 15:55:06 -0500 Subject: [Bioperl-l] Why does Bio::DB::GFF::Feature::gff3_string swap start and stop coordinates?? In-Reply-To: References: <79FDA731-CC37-42B0-8200-0865F52C1CAC@uiuc.edu> <62034FE5-C375-49F3-9A4E-2545F93615F4@uiuc.edu> <9FAD90F3-79B3-4002-9A11-6C11F7D00614@uiuc.edu> Message-ID: I'll try getting to that in tonight. Been pretty tied up lately... chris On Jun 15, 2007, at 2:37 PM, Mark Johnson wrote: > Patches waiting in Bugzilla (Bug #2299). Changes: > > -Bio::Tools::Glimmer now emits Bio::SeqFeature::Generic features for > prokaryotic reports (Glimmer2/Glimmer3) > -Bio::Tools::Glimmer now produces features with Fuzzy or Split > locations as appropriate (partial or circular/wraparound predictions) > -Bio::Tools:Glimmer goes through the Glimmer3 .detail file to pull out > sequence lengths > -Bio::Tools::Run::Glimmer passes along the sequence length to > Bio::Tools::Glimmer for Glimmer2 > > I should probably modify Bio::Tools::Genemark to use > Bio::SeqFeature::Generic features for prokaryotic reports, to be > consistent, but this is more likely to surprise people. If nobody > screams about the change to Bio::Tools::Glimmer, I'll do it at some > point. > > On 5/21/07, Chris Fields wrote: >> >> On May 21, 2007, at 7:29 PM, Torsten Seemann wrote: >> >>>> glimmer2/3 both assume the genome is circular by default (I'm >>>> assuming since Glimmer2/3 are used for bacterial genomes). Acc. 
to >>>> the Glimmer3 release notes the detail file has the information >>>> in the >>>> header; from the Glimmer3 data used for tests: >>> >>> You beat me to the reply Chris - yes, Glimmer2/3 assume circular >>> chromosome by default. I had forgotten about this in earlier >>> discussions of the new Glimmer parsers as I normally run it in >>> --linear / -L mode (even if I know it is circular) because it is >>> easier to handle, and our sequencer/assembler team usually gets the >>> origin of replication right. >>> >>>> Command: /bio/sw/glimmer3/bin/glimmer3 -o 50 -g 110 -t 30 ../ >>>> BCTDNA >>>> Glimmer3.icm Glimmer3 >>> >>> I did a double-take here - that's the path to my Glimmer3 >>> installation! It took me a couple of minutes to realise that you got >>> it from the bioperl test data I created. D'oh! :-) >> >> Yep, I forgot about that! >> >>>> There are options available for glimmer3 (-L, -X) that specify a >>>> linear sequence or allow ORFs to extend past the end of the >>>> sequence >>>> analyzed (the latter assumes a linear sequence). >>> >>> If the -L mode should produce Bio::Location::Split objects, I >>> guess if >>> -X is used >>> it should produce Bio::Location::Fuzzy objects too... >>> >>> --Torsten >> >> True, didn't think about that one. Def. something to consider adding >> in. >> >> chris >> >> >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From rvos at interchange.ubc.ca Fri Jun 15 17:08:17 2007 From: rvos at interchange.ubc.ca (rvos) Date: Fri, 15 Jun 2007 14:08:17 -0700 (PDT) Subject: [Bioperl-l] SVN and ...Re: Perltidy Message-ID: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> Hi, I would very much prefer it if bioperl moved to svn. 
I'm considering merging Bio::Phylo (to the extent that that's possible/practical) with bioperl and move it to an OBF repository, but I'd rather not go back to CVS. Rutger -----Original Message----- > Date: Fri Jun 15 07:56:23 PDT 2007 > From: "Chris Fields" > Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy > To: "Sendu Bala" > > > On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote: > > >>>> ... > >>> Can we do any sort of massive conversion at some logical timepoint. > >>> Probably after a branch release or something? Because it basically > >>> means we're going to have differences on nearly every line which is > >>> going to make diff-ing difficult when debugging old/new versions. > >>> Maybe it is not a problem because we aren't introducing and new > >>> bugs! > > > > Sorry, can you clarify the problem you envisage? And why would > > making a branch release help? > > Maybe the worry is that mass conversion in such a large codebase > could lead to hard-to-locate bugs. Shouldn't occur but who knows w/o > trying? > > >> I agree; if we intend on doing this it should be all at once, > >> maybe on a branch dedicated to ensure that code changes don't > >> tank tests (they shouldn't but one never knows). We would then > >> need a script up- and-running that tidies everything up prior to > >> commits (though what happens if perltidy tanks?...). > >> Sendu, up for it? > > > > If its going to be difficult and a hassle, for such an unnecessary > > thing I'm not sure its worth it. There are more pressing things to > > be done for Bioperl. > > > > If I can just run perltidy on the entire package and commit, I'd do > > it. If that's not appropriate, I won't. > > The choices aren't necessarily all or nothing. What about voluntary, > recommended use of a perltidy config file included with the > distribution, with additional 'caveats'? See my response to Sean. > > >>>> About svn > > [snip] > >> Stepped into that one, didn't I! 
I'll look into how much effort > >> is involved and try getting something going in the next month or > >> two, maybe sooner if time permits. I'm lacking on SVN-foo as > >> well but it might be worth looking into. > > > > I'd put this in the unnecessary-but-nice category as well. If it > > will be as easy as my ->new change, go ahead. If not, there are > > more pressing matters (POD fixing, test script updating and > > finishing...). > > A few other open-bio projects have actively discussed a CVS->SVN > migration (BioRuby and I think BioPython, though the latter could be > wrong). As I said, "it might be worth looking into" to weigh the > pros/cons, get others opinions from others who have made the > transition, etc. We could, as Jason suggested, even set up a tester > SVN w/o making it the default codebase (lock it off to a few testers, > have CVS commits automatically/manually carry over to SVN, etc). > > I agree with you that it's not feasible to switch over prior to a > release and that there are more pressing issues, but it doesn't hurt > having an open discussion about it. > > chris > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From spiros at lokku.com Fri Jun 15 17:40:32 2007 From: spiros at lokku.com (Spiros Denaxas) Date: Fri, 15 Jun 2007 22:40:32 +0100 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> Message-ID: On 6/15/07, rvos wrote: > Hi, > > I would very much prefer it if bioperl moved to svn. I'm considering merging Bio::Phylo (to the extent that that's possible/practical) with bioperl and move it to an OBF repository, but I'd rather not go back to CVS. > > Rutger > I second that, SVN seems like the reasonable choice. I would be more than happy to help out as well. 
Spiros > > -----Original Message----- > > > Date: Fri Jun 15 07:56:23 PDT 2007 > > From: "Chris Fields" > > Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy > > To: "Sendu Bala" > > > > > > On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote: > > > > >>>> ... > > >>> Can we do any sort of massive conversion at some logical timepoint. > > >>> Probably after a branch release or something? Because it basically > > >>> means we're going to have differences on nearly every line which is > > >>> going to make diff-ing difficult when debugging old/new versions. > > >>> Maybe it is not a problem because we aren't introducing and new > > >>> bugs! > > > > > > Sorry, can you clarify the problem you envisage? And why would > > > making a branch release help? > > > > Maybe the worry is that mass conversion in such a large codebase > > could lead to hard-to-locate bugs. Shouldn't occur but who knows w/o > > trying? > > > > >> I agree; if we intend on doing this it should be all at once, > > >> maybe on a branch dedicated to ensure that code changes don't > > >> tank tests (they shouldn't but one never knows). We would then > > >> need a script up- and-running that tidies everything up prior to > > >> commits (though what happens if perltidy tanks?...). > > >> Sendu, up for it? > > > > > > If its going to be difficult and a hassle, for such an unnecessary > > > thing I'm not sure its worth it. There are more pressing things to > > > be done for Bioperl. > > > > > > If I can just run perltidy on the entire package and commit, I'd do > > > it. If that's not appropriate, I won't. > > > > The choices aren't necessarily all or nothing. What about voluntary, > > recommended use of a perltidy config file included with the > > distribution, with additional 'caveats'? See my response to Sean. > > > > >>>> About svn > > > [snip] > > >> Stepped into that one, didn't I! 
I'll look into how much effort > > >> is involved and try getting something going in the next month or > > >> two, maybe sooner if time permits. I'm lacking on SVN-foo as > > >> well but it might be worth looking into. > > > > > > I'd put this in the unnecessary-but-nice category as well. If it > > > will be as easy as my ->new change, go ahead. If not, there are > > > more pressing matters (POD fixing, test script updating and > > > finishing...). > > > > A few other open-bio projects have actively discussed a CVS->SVN > > migration (BioRuby and I think BioPython, though the latter could be > > wrong). As I said, "it might be worth looking into" to weigh the > > pros/cons, get others opinions from others who have made the > > transition, etc. We could, as Jason suggested, even set up a tester > > SVN w/o making it the default codebase (lock it off to a few testers, > > have CVS commits automatically/manually carry over to SVN, etc). > > > > I agree with you that it's not feasible to switch over prior to a > > release and that there are more pressing issues, but it doesn't hurt > > having an open discussion about it. 
> > > > chris From hlapp at gmx.net Fri Jun 15 18:10:25 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 15 Jun 2007 18:10:25 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> Message-ID: <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
So should we set up a sandbox svn repository and those who would like to help out
- take shots at migrating bioperl (any current cvs snapshot will do) to svn
- you document what you find yourself having to do in trying to make it work
- you report back when you think you have a working repository
- we all get a defined amount of time to test to our hearts' content, say 2 weeks
- you fix issues that were encountered
- report back when done, followed by retesting for, say 1 week
- iterate previous 2 steps until no issues and no objections to migration
- two more weeks of warning period to all developers to commit all outstanding changes, or reapply them to a future svn checkout
- pull the trigger by locking down cvs, applying the migration as worked out before, and announcing that BioPerl is now on svn
- get free beer at next BOSC (I'll pay if no one else does)
This may not be precisely the plan that needs to be executed, but it's probably somewhere along those lines. If there are volunteers who would like to spearhead this, then power to you - I think everyone is in favor and the advantages of svn don't need to be debated. The only reason it hasn't happened yet is because no one has stepped forward who would have the energy.
I'm sure ChrisD will gladly create the svn sandbox if we have volunteers lined up to get going. -hilmar On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote: > On 6/15/07, rvos wrote: >> Hi, >> >> I would very much prefer it if bioperl moved to svn. I'm >> considering merging Bio::Phylo (to the extent that that's possible/ >> practical) with bioperl and move it to an OBF repository, but I'd >> rather not go back to CVS. >> >> Rutger >> > > I second that, SVN seems like the reasonable choice. I would be more > than happy to help out as well. > > Spiros > >> >> -----Original Message----- >> >>> Date: Fri Jun 15 07:56:23 PDT 2007 >>> From: "Chris Fields" >>> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy >>> To: "Sendu Bala" >>> >>> >>> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote: >>> >>>>>>> ... >>>>>> Can we do any sort of massive conversion at some logical >>>>>> timepoint. >>>>>> Probably after a branch release or something? Because it >>>>>> basically >>>>>> means we're going to have differences on nearly every line >>>>>> which is >>>>>> going to make diff-ing difficult when debugging old/new versions. >>>>>> Maybe it is not a problem because we aren't introducing and new >>>>>> bugs! >>>> >>>> Sorry, can you clarify the problem you envisage? And why would >>>> making a branch release help? >>> >>> Maybe the worry is that mass conversion in such a large codebase >>> could lead to hard-to-locate bugs. Shouldn't occur but who knows >>> w/o >>> trying? >>> >>>>> I agree; if we intend on doing this it should be all at once, >>>>> maybe on a branch dedicated to ensure that code changes don't >>>>> tank tests (they shouldn't but one never knows). We would then >>>>> need a script up- and-running that tidies everything up prior to >>>>> commits (though what happens if perltidy tanks?...). >>>>> Sendu, up for it? >>>> >>>> If its going to be difficult and a hassle, for such an unnecessary >>>> thing I'm not sure its worth it. 
There are more pressing things to >>>> be done for Bioperl. >>>> >>>> If I can just run perltidy on the entire package and commit, I'd do >>>> it. If that's not appropriate, I won't. >>> >>> The choices aren't necessarily all or nothing. What about >>> voluntary, >>> recommended use of a perltidy config file included with the >>> distribution, with additional 'caveats'? See my response to Sean. >>> >>>>>>> About svn >>>> [snip] >>>>> Stepped into that one, didn't I! I'll look into how much effort >>>>> is involved and try getting something going in the next month or >>>>> two, maybe sooner if time permits. I'm lacking on SVN-foo as >>>>> well but it might be worth looking into. >>>> >>>> I'd put this in the unnecessary-but-nice category as well. If it >>>> will be as easy as my ->new change, go ahead. If not, there are >>>> more pressing matters (POD fixing, test script updating and >>>> finishing...). >>> >>> A few other open-bio projects have actively discussed a CVS->SVN >>> migration (BioRuby and I think BioPython, though the latter could be >>> wrong). As I said, "it might be worth looking into" to weigh the >>> pros/cons, get others opinions from others who have made the >>> transition, etc. We could, as Jason suggested, even set up a tester >>> SVN w/o making it the default codebase (lock it off to a few >>> testers, >>> have CVS commits automatically/manually carry over to SVN, etc). >>> >>> I agree with you that it's not feasible to switch over prior to a >>> release and that there are more pressing issues, but it doesn't hurt >>> having an open discussion about it. 
>>>
>>> chris
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From jason at bioperl.org Fri Jun 15 18:23:15 2007
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 15 Jun 2007 15:23:15 -0700
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
 <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
Message-ID:

Sounds like a plan. I'll be curious to see if we can still keep
anonymous CVS working, as I'd like to not have to pull the plug on that.
There are some threads out on the web about how to do this with a commit
rule on SVN.

Also, can someone who is close enough to all the SVN benefits please
elaborate how it is going to help _this_ project? Perhaps you would be
willing to put a few words up -- like on (a to-be-created):
http://bioperl.org/wiki/BioPerl:Version_control_changeover

This way, if anonymous CVS is broken and/or developers who haven't been
paying attention come back to commit code and ask why things changed, we
don't have to compose long emails...
=) -jason On Jun 15, 2007, at 3:10 PM, Hilmar Lapp wrote: > So should we set up a sandbox svn repository and those who would like > to help out > > - take shots at migrating bioperl (any current cvs snapshot will do) > to svn > > - you document what you find yourself having to do in trying to make > it work > > - you report back when you think you have a working repository > > - we all get a defined amount of time to test to our hearts' content, > say 2 weeks > > - you fix issues that were encountered > > - report back when done, followed by retesting for, say 1 week > > - iterate previous 2 steps until no issues and no objections to > migration > > - two more weeks of warning period to all developers to commit all > outstanding changes, or reapply them to a future svn checkout > > - pull the trigger by locking down cvs, applying the migration as > worked out before, and announcing that BioPerl is now on svn > > - get free beer at next BOSC (I'll pay if no one else does) > > This may not be precisely the plan that needs to be executed, but > it's probably somewhere along those lines. > > If there are volunteers who would like to spearhead this, then power > to you - I think everyone is in favor and the advantages of svn don't > need to be debated. The only reason it hasn't happened yet is because > no one has stepped forward who would have the energy. > > I'm sure ChrisD will gladly create the svn sandbox if we have > volunteers lined up to get going. > > -hilmar > > On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote: > >> On 6/15/07, rvos wrote: >>> Hi, >>> >>> I would very much prefer it if bioperl moved to svn. I'm >>> considering merging Bio::Phylo (to the extent that that's possible/ >>> practical) with bioperl and move it to an OBF repository, but I'd >>> rather not go back to CVS. >>> >>> Rutger >>> >> >> I second that, SVN seems like the reasonable choice. I would be more >> than happy to help out as well. 
>> [snip]
>>>> >>>> chris >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From sheris at eps.berkeley.edu Fri Jun 15 18:58:12 2007 From: sheris at eps.berkeley.edu (Sheri Simmons) Date: Fri, 15 Jun 2007 15:58:12 -0700 Subject: [Bioperl-l] seq doesn't validate error Message-ID: <200706151558.12911.sheris@eps.berkeley.edu> Hi, I'm getting an error as follows when I try to reverse complement a sequence string stored in a hash of arrays. The storage code is: $nstarthash{$key} = [$sortchecks[0], join("", @nseq), join("",@{$seqhash{$key}})]; the sequence of interest is the element at index 1. 
Later, I try to retrieve this string for a subset of keys so I can reverse
complement it based on input from another hash (%complement):

my %revcomphash = map { my $read = $_;
    grep $complement{$read} eq 'C', %complement;
    {$_, (Bio::Seq->new(-seq =>$nstarthash{$_}[1]))->revcom->seq()};}
    keys(%nstarthash);

I get the following warning (long sequence edited for clarity):

-- -------------------- WARNING ---------------------
MSG: seq doesn't validate, mismatch is 1
---------------------------------------------------

------------- EXCEPTION -------------
MSG: Attempting to set the sequence to [GCCCCTGTAATCGCTTTTATATCGTCAGCGATC] which does not look healthy
STACK Bio::PrimarySeq::seq /usr/share/perl5/Bio/PrimarySeq.pm:268
STACK Bio::PrimarySeq::new /usr/share/perl5/Bio/PrimarySeq.pm:217
STACK Bio::Seq::new /usr/share/perl5/Bio/Seq.pm:498
STACK toplevel ../quality_wrapper.pl:103

I cannot find any non-allowed characters in the sequence, and the
de-referencing appears to work correctly. Can anyone help me?
I'm using the latest Bioperl installation (1.5.2) with ActivePerl5.8 on a
Mepis 6.5 system.

Thanks
Sheri

---------------------------------------------------------------------
Sheri Simmons
Department of Earth and Planetary Sciences
University of California, Berkeley
Berkeley, CA 94720-4767

From Kevin.M.Brown at asu.edu Fri Jun 15 19:11:34 2007
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Fri, 15 Jun 2007 16:11:34 -0700
Subject: [Bioperl-l] seq doesn't validate error
In-Reply-To: <200706151558.12911.sheris@eps.berkeley.edu>
References: <200706151558.12911.sheris@eps.berkeley.edu>
Message-ID: <1A4207F8295607498283FE9E93B775B4034C6E7B@EX02.asurite.ad.asu.edu>

> I'm getting an error as follows when I try to reverse
> complement a sequence string stored in a hash of arrays. The
> storage code is:
>
> $nstarthash{$key} = [$sortchecks[0], join("",
> @nseq),
> join("",@{$seqhash{$key}})];
>
> the sequence of interest is the element at index 1.
>
> Later, I try to retrieve this string for a subset of keys so
> I can reverse complement it based on input from another hash
> (%complement):
>
> my %revcomphash = map { my $read = $_;
>     grep $complement{$read} eq 'C', %complement;
>     {$_, (Bio::Seq->new(-seq =>$nstarthash{$_}[1]))->revcom->seq()};}
>     keys(%nstarthash);
>
>
> I get the following warning (long sequence edited for clarity):
>
> -- -------------------- WARNING ---------------------
> MSG: seq doesn't validate, mismatch is 1
> ---------------------------------------------------
>
> ------------- EXCEPTION -------------
> MSG: Attempting to set the sequence to
> [GCCCCTGTAATCGCTTTTATATCGTCAGCGATC]
> which does not look healthy
> STACK Bio::PrimarySeq::seq /usr/share/perl5/Bio/PrimarySeq.pm:268
> STACK Bio::PrimarySeq::new /usr/share/perl5/Bio/PrimarySeq.pm:217
> STACK Bio::Seq::new /usr/share/perl5/Bio/Seq.pm:498
> STACK toplevel ../quality_wrapper.pl:103
>
> I cannot find any non-allowed characters in the sequence, and
> the de-referencing appears to work correctly. Can anyone help me?
> I'm using the latest Bioperl installation (1.5.2) with
> ActivePerl5.8 on a Mepis 6.5 system.

Try telling the Bio::Seq object what alphabet to use when creating it.
I tend to create them like:

Bio::Seq->new(-seq=> $seqvar, -alphabet=>'dna')

From sheris at eps.berkeley.edu Fri Jun 15 19:53:04 2007
From: sheris at eps.berkeley.edu (Sheri Simmons)
Date: Fri, 15 Jun 2007 16:53:04 -0700
Subject: [Bioperl-l] seq doesn't validate error
In-Reply-To: <1A4207F8295607498283FE9E93B775B4034C6E7B@EX02.asurite.ad.asu.edu>
References: <200706151558.12911.sheris@eps.berkeley.edu>
 <1A4207F8295607498283FE9E93B775B4034C6E7B@EX02.asurite.ad.asu.edu>
Message-ID: <200706151653.04135.sheris@eps.berkeley.edu>

Thanks for the suggestion, but that still gives the same error as before.
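
The posts so far do not settle what trips the validation, so the
following is only a sketch of one way to check. It assumes the stored
string carries an invisible character such as a trailing newline, which
is a guess and not something the thread confirms. It lists every
character outside the IUPAC nucleotide codes with its position and
ordinal, and it picks the reads marked 'C' with grep before building the
hash; in the posted snippet the grep result is discarded, so it filters
nothing. The sample data and the helpers bad_chars and revcomp are
hypothetical stand-ins. revcomp uses plain tr/// so the example runs
without BioPerl; the Bio::Seq->new(...)->revcom->seq call from the posts
could take its place.

```perl
use strict;
use warnings;

# Hypothetical sample data; the real %nstarthash and %complement are not shown.
my %nstarthash = (
    read1 => [ 10, "GCCCCTGTAATC\n", 'q1' ],    # hidden trailing newline
    read2 => [ 20, 'ATGCCGTA',       'q2' ],
    read3 => [ 30, 'TTGACA',         'q3' ],
);
my %complement = ( read1 => 'C', read2 => 'U', read3 => 'C' );

# Report each character that is not an IUPAC nucleotide code or gap symbol,
# with its 1-based position and ordinal so invisible characters show up.
sub bad_chars {
    my ($seq) = @_;
    my @bad;
    while ( $seq =~ /([^ACGTUNRYKMSWBDHVacgtunrykmswbdhv.\-])/g ) {
        push @bad, sprintf( 'pos %d: ord %d', pos($seq), ord $1 );
    }
    return @bad;
}

# Plain-Perl reverse complement (assumption: DNA, unambiguous bases only).
sub revcomp {
    my ($seq) = @_;
    my $rc = scalar reverse $seq;
    $rc =~ tr/ACGTacgt/TGCAtgca/;
    return $rc;
}

my %revcomphash;
# Filter the keys first, then transform only the reads flagged 'C'.
for my $read ( grep { defined $complement{$_} && $complement{$_} eq 'C' }
               sort keys %nstarthash ) {
    my $raw = $nstarthash{$read}[1];
    if ( my @bad = bad_chars($raw) ) {
        warn "$read has unexpected characters: @bad\n";
    }
    ( my $seq = $raw ) =~ s/\s+//g;    # strip whitespace, one possible culprit
    next if bad_chars($seq);           # still invalid: leave it out
    $revcomphash{$read} = revcomp($seq);
}

print "$_ => $revcomphash{$_}\n" for sort keys %revcomphash;
```

If the warning names a position, printing the stored string with its
ordinals at that position is usually enough to see what was joined in.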
On Friday 15 June 2007 4:11 pm, Kevin Brown wrote: > > I'm getting an error as follows when I try to reverse > > complement a sequence string stored in a hash of arrays. The > > storage code is: > > > > $nstarthash{$key} = [$sortchecks[0], join("", > > @nseq), > > join("",@{$seqhash{$key}})]; > > > > the sequence of interest is the element at index 1. > > > > Later, I try to retrieve this string for a subset of keys so > > I can reverse complement it based on input from another hash > > (%complement): > > > > my %revcomphash = map { my $read = $_; > > grep $complement{$read} eq 'C', %complement; > > {$_, (Bio::Seq->new(-seq > > =>$nstarthash{$_}[1]))->revcom->seq()};} > > keys(%nstarthash); > > > > > > I get the following warning (long sequence edited for clarity): > > > > -- -------------------- WARNING --------------------- > > MSG: seq doesn't validate, mismatch is 1 > > --------------------------------------------------- > > > > ------------- EXCEPTION ------------- > > MSG: Attempting to set the sequence to > > [GCCCCTGTAATCGCTTTTATATCGTCAGCGATC] > > which does not look healthy > > STACK Bio::PrimarySeq::seq /usr/share/perl5/Bio/PrimarySeq.pm:268 > > STACK Bio::PrimarySeq::new /usr/share/perl5/Bio/PrimarySeq.pm:217 > > STACK Bio::Seq::new /usr/share/perl5/Bio/Seq.pm:498 STACK > > toplevel ../quality_wrapper.pl:103 > > > > I cannot find any non-allowed characters in the sequence, and > > the de-referencing appears to work correctly. Can anyone help me? > > I'm using the latest Bioperl installation (1.5.2) with > > ActivePerl5.8 on a Mepis 6.5 system. > > Try telling the Bio::Seq object what alphabet to use when creating it. 
> I tend to create them like:
>
> Bio::Seq->new(-seq=> $seqvar, -alphabet=>'dna')

--
Sheri Simmons
Department of Earth and Planetary Sciences
University of California, Berkeley
Berkeley, CA 94720-4767

From hlapp at gmx.net Fri Jun 15 21:27:42 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 15 Jun 2007 21:27:42 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <18035.14352.963113.473274@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
 <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
 <18035.14352.963113.473274@almost.alerce.com>
Message-ID:

Could you post a ticket to the helpdesk: support at open-bio.org.

-hilmar

On Jun 15, 2007, at 9:08 PM, George Hartzell wrote:

> Hilmar Lapp writes:
>> So should we set up a sandbox svn repository and those who would like
>> to help out
>>
>> - take shots at migrating bioperl (any current cvs snapshot will do)
>> to svn
>
> Free Beer, huh? Do you deliver?
>
> Can you package up a tarball of the cvs repository (bzip or gzip would
> save some time) itself?
>
> thanks!
>
> g.

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From hartzell at alerce.com Fri Jun 15 21:08:32 2007
From: hartzell at alerce.com (George Hartzell)
Date: Fri, 15 Jun 2007 21:08:32 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
 <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
Message-ID: <18035.14352.963113.473274@almost.alerce.com>

Hilmar Lapp writes:
> So should we set up a sandbox svn repository and those who would like
> to help out
>
> - take shots at migrating bioperl (any current cvs snapshot will do)
> to svn

Free Beer, huh? Do you deliver?

Can you package up a tarball of the cvs repository (bzip or gzip would
save some time) itself?

thanks!

g.

From cjfields at uiuc.edu Fri Jun 15 21:42:05 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 20:42:05 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <18035.14352.963113.473274@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
 <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
 <18035.14352.963113.473274@almost.alerce.com>
Message-ID: <6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu>

The browsable CVS has a 'Download tarball' link if that helps.

http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/?cvsroot=bioperl

chris

On Jun 15, 2007, at 8:08 PM, George Hartzell wrote:

> Hilmar Lapp writes:
>> So should we set up a sandbox svn repository and those who would like
>> to help out
>>
>> - take shots at migrating bioperl (any current cvs snapshot will do)
>> to svn
>
> Free Beer, huh? Do you deliver?
>
> Can you package up a tarball of the cvs repository (bzip or gzip would
> save some time) itself?
>
> thanks!
>
> g.

From cjfields at uiuc.edu Fri Jun 15 21:50:09 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 20:50:09 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
 <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
Message-ID: <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>

I'll help out to the extent I can w/o having the SVN know-how. We
need (as Jason points out) someone who can detail the benefits and
maybe keep an updated journal on the wiki.

I believe at least one or two of the other Bio* projects have
contemplated moving over to SVN, which may be worth checking out.
chris On Jun 15, 2007, at 5:10 PM, Hilmar Lapp wrote: > So should we set up a sandbox svn repository and those who would like > to help out > > - take shots at migrating bioperl (any current cvs snapshot will do) > to svn > > - you document what you find yourself having to do in trying to make > it work > > - you report back when you think you have a working repository > > - we all get a defined amount of time to test to our hearts' content, > say 2 weeks > > - you fix issues that were encountered > > - report back when done, followed by retesting for, say 1 week > > - iterate previous 2 steps until no issues and no objections to > migration > > - two more weeks of warning period to all developers to commit all > outstanding changes, or reapply them to a future svn checkout > > - pull the trigger by locking down cvs, applying the migration as > worked out before, and announcing that BioPerl is now on svn > > - get free beer at next BOSC (I'll pay if no one else does) > > This may not be precisely the plan that needs to be executed, but > it's probably somewhere along those lines. > > If there are volunteers who would like to spearhead this, then power > to you - I think everyone is in favor and the advantages of svn don't > need to be debated. The only reason it hasn't happened yet is because > no one has stepped forward who would have the energy. > > I'm sure ChrisD will gladly create the svn sandbox if we have > volunteers lined up to get going. > > -hilmar > > On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote: > >> On 6/15/07, rvos wrote: >>> Hi, >>> >>> I would very much prefer it if bioperl moved to svn. I'm >>> considering merging Bio::Phylo (to the extent that that's possible/ >>> practical) with bioperl and move it to an OBF repository, but I'd >>> rather not go back to CVS. >>> >>> Rutger >>> >> >> I second that, SVN seems like the reasonable choice. I would be more >> than happy to help out as well. 
>> [snip]
>>>> >>>> chris >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Fri Jun 15 22:12:55 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 15 Jun 2007 22:12:55 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <18035.14352.963113.473274@almost.alerce.com> <6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu> Message-ID: <6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net> I think he meant the cvs repository itself, containing all the change data. -hilmar On Jun 15, 2007, at 9:42 PM, Chris Fields wrote: > The browsable CVS has a 'Download tarball' link if that helps. > > http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/? 
> cvsroot=bioperl > > chris > > On Jun 15, 2007, at 8:08 PM, George Hartzell wrote: > >> Hilmar Lapp writes: >>> So should we set up a sandbox svn repository and those who would >>> like >>> to help out >>> >>> - take shots at migrating bioperl (any current cvs snapshot will do) >>> to svn >> >> Free Beer, huh? Do you deliver? >> >> Can you package up a tarball of the cvs repository (bzip or gzip >> would >> save some time) itself? >> >> thanks! >> >> g. > > > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Fri Jun 15 22:37:55 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 15 Jun 2007 21:37:55 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <18035.14352.963113.473274@almost.alerce.com> <6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu> <6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net> Message-ID: Ah, got it. Sorry. George, planning on taking this up? chris On Jun 15, 2007, at 9:12 PM, Hilmar Lapp wrote: > I think he meant the cvs repository itself, containing all the > change data. -hilmar > > On Jun 15, 2007, at 9:42 PM, Chris Fields wrote: > >> The browsable CVS has a 'Download tarball' link if that helps. >> >> http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/? >> cvsroot=bioperl >> >> chris >> >> On Jun 15, 2007, at 8:08 PM, George Hartzell wrote: >> >>> Hilmar Lapp writes: >>>> So should we set up a sandbox svn repository and those who would >>>> like >>>> to help out >>>> >>>> - take shots at migrating bioperl (any current cvs snapshot will >>>> do) >>>> to svn >>> >>> Free Beer, huh? Do you deliver? >>> >>> Can you package up a tarball of the cvs repository (bzip or gzip >>> would >>> save some time) itself? 
>>> >>> thanks! >>> >>> g. >> >> >> > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Sat Jun 16 04:20:57 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Sat, 16 Jun 2007 09:20:57 +0100 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <18035.14352.963113.473274@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <18035.14352.963113.473274@almost.alerce.com> Message-ID: <46739D69.4090204@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 George Hartzell wrote: > Hilmar Lapp writes: > > So should we set up a sandbox svn repository and those who would like > > to help out > > > > - take shots at migrating bioperl (any current cvs snapshot will do) > > to svn > > Free Beer, huh? Do you deliver? > > Can you package up a tarball of the cvs repository (bzip or gzip would > save some time) itself? > > thanks! > > g. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Sounds like George might know what he's doing! I have a question about setting up svn access. I believe access can be done in several ways, over webdav, over ssh and probably others too. Do you have any knowledge about the benefits of one over the other? I suppose I'm thinking of what to implement to allow anonymous read access for users and authenticated access for developers. Nath p.s. if you need any monkeys to do some work I'm happy to help out as much as possible. 
-----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGc51pczuW2jkwy2gRAmi9AJ0XojVdh4ckXoc3bwVSmeNw95cR7QCfV+G9 Lb9NUEe4dkCakQ+Gc7Py98A= =BG9m -----END PGP SIGNATURE----- From rvos at interchange.ubc.ca Sat Jun 16 06:37:11 2007 From: rvos at interchange.ubc.ca (rvos) Date: Sat, 16 Jun 2007 03:37:11 -0700 (PDT) Subject: [Bioperl-l] SVN and ...Re: Perltidy Message-ID: <15232024.1181990231860.JavaMail.myubc2@handel.my.ubc.ca> I can volunteer some time to help out with this. Rutger -----Original Message----- > Date: Fri Jun 15 15:10:25 PDT 2007 > From: "Hilmar Lapp" > Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy > To: spiros at lokku.com > > So should we set up a sandbox svn repository and those who would like > to help out > > - take shots at migrating bioperl (any current cvs snapshot will do) > to svn > > - you document what you find yourself having to do in trying to make > it work > > - you report back when you think you have a working repository > > - we all get a defined amount of time to test to our hearts' content, > say 2 weeks > > - you fix issues that were encountered > > - report back when done, followed by retesting for, say 1 week > > - iterate previous 2 steps until no issues and no objections to > migration > > - two more weeks of warning period to all developers to commit all > outstanding changes, or reapply them to a future svn checkout > > - pull the trigger by locking down cvs, applying the migration as > worked out before, and announcing that BioPerl is now on svn > > - get free beer at next BOSC (I'll pay if no one else does) > > This may not be precisely the plan that needs to be executed, but > it's probably somewhere along those lines. > > If there are volunteers who would like to spearhead this, then power > to you - I think everyone is in favor and the advantages of svn don't > need to be debated. 
> [snip]
> >>> > >>> chris > >>> > >>> _______________________________________________ > >>> Bioperl-l mailing list > >>> Bioperl-l at lists.open-bio.org > >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >> > >> _______________________________________________ > >> Bioperl-l mailing list > >> Bioperl-l at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >> > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From sdavis2 at mail.nih.gov Sat Jun 16 07:21:47 2007 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Sat, 16 Jun 2007 07:21:47 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> Message-ID: <4673C7CB.1030709@mail.nih.gov> Chris Fields wrote: > I'll help out to the extent I can w/o having the SVN know-how. We > need (as Jason points out) someone who can detail the benefits and > maybe keep an updated journal on the wiki. > > I believe at least one or two of the other Bio* contemplated moving > over to SVN, which may be worth checking out. > The bioconductor project is on SVN. The project includes over 200 packages (the equivalent of perl modules) with something around 150-200 ACTIVE developers. They also have a build system for several OSes that operates on a cron-like system with builds of several versions approximately daily. 
Their system is running at something like revision 30,000, so they have significant experience. If anyone would like technical support, I can certainly ask the folks maintaining their site if they can give some input. Let me know if anyone would like a contact person. As for access, it is typically over http (or https). Access controls can be set up on the server side while allowing anonymous access for checkout. There are many excellent SVN clients for every OS, so that should not be a problem. Sean From cjfields at uiuc.edu Sat Jun 16 10:02:35 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 16 Jun 2007 09:02:35 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <4673C7CB.1030709@mail.nih.gov> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> Message-ID: <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> On Jun 16, 2007, at 6:21 AM, Sean Davis wrote: > Chris Fields wrote: >> I'll help out to the extent I can w/o having the SVN know-how. We >> need (as Jason points out) someone who can detail the benefits and >> maybe keep an updated journal on the wiki. >> >> I believe at least one or two of the other Bio* contemplated moving >> over to SVN, which may be worth checking out. >> > The bioconductor project is on SVN. The project includes over 200 > packages (the equivalent of perl modules) with something around > 150-200 > ACTIVE developers. They also have a build system for several OSes > that > operates on a cron-like system with builds of several versions > approximately daily. Their system is running at something like > revision > 30,000, so they have significant experience. If anyone would like > technical support, I can certainly ask the folks maintaining their > site > if they can give some input. Let me know if anyone would like a > contact > person.
> > As for access, the typical access is over http (or https). Access > controls can be set up on the server side while allowing anonymous > access for checkout. There are many excellent SVN for every OS, so > that > should not be a problem. > > Sean It looks like George Hartzell may be taking a crack at it, with Rutger Vos, Nathan Haigh, and moi helping out where needed. If so we could have something testable relatively soon. After that we'll need to work out a few other issues, basically what's on Hilmar's list. chris From hlapp at gmx.net Sat Jun 16 10:40:08 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Sat, 16 Jun 2007 10:40:08 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> Message-ID: <51E89347-4AF7-482E-98DB-BE1AA0138A91@gmx.net> Just as an aside, even if we can't keep anonymous cvs working, I would think that with apache URL rewriting and a small CGI script that returns an appropriate page redirect, we can keep the hyperlinks that people may have bookmarked functional without too much trouble. -hilmar On Jun 15, 2007, at 6:23 PM, Jason Stajich wrote: > Sounds like a plan, I'll be curious to see if we can still get keep > anonymous CVS working as I'd like to not have to pull the plug on > that. There are some threads out on the web about how to do this > with a commit rule on SVN. > > Also, can someone who is close enough to all the SVN benefits > please elaborate how it is going to help _this_ project? > Perhaps you would be willing to put a few words up -- like on (a to > be created): > http://bioperl.org/wiki/BioPerl:Version_control_changeover > > This way if anonymous CVS is broken and/or developers who haven't > been paying attention come back to commit code ask why things > changed we don't have to compose long emails...
=) > > -jason > On Jun 15, 2007, at 3:10 PM, Hilmar Lapp wrote: > >> So should we set up a sandbox svn repository and those who would like >> to help out >> >> - take shots at migrating bioperl (any current cvs snapshot will do) >> to svn >> >> - you document what you find yourself having to do in trying to make >> it work >> >> - you report back when you think you have a working repository >> >> - we all get a defined amount of time to test to our hearts' content, >> say 2 weeks >> >> - you fix issues that were encountered >> >> - report back when done, followed by retesting for, say 1 week >> >> - iterate previous 2 steps until no issues and no objections to >> migration >> >> - two more weeks of warning period to all developers to commit all >> outstanding changes, or reapply them to a future svn checkout >> >> - pull the trigger by locking down cvs, applying the migration as >> worked out before, and announcing that BioPerl is now on svn >> >> - get free beer at next BOSC (I'll pay if no one else does) >> >> This may not be precisely the plan that needs to be executed, but >> it's probably somewhere along those lines. >> >> If there are volunteers who would like to spearhead this, then power >> to you - I think everyone is in favor and the advantages of svn don't >> need to be debated. The only reason it hasn't happened yet is because >> no one has stepped forward who would have the energy. > >> >> I'm sure ChrisD will gladly create the svn sandbox if we have >> volunteers lined up to get going. >> >> -hilmar >> >> On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote: >> >>> On 6/15/07, rvos wrote: >>>> Hi, >>>> >>>> I would very much prefer it if bioperl moved to svn. I'm >>>> considering merging Bio::Phylo (to the extent that that's possible/ >>>> practical) with bioperl and move it to an OBF repository, but I'd >>>> rather not go back to CVS. >>>> >>>> Rutger >>>> >>> >>> I second that, SVN seems like the reasonable choice. 
I would be more >>> than happy to help out as well. >>> >>> Spiros >>> >>>> >>>> -----Original Message----- >>>> >>>>> Date: Fri Jun 15 07:56:23 PDT 2007 >>>>> From: "Chris Fields" >>>>> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy >>>>> To: "Sendu Bala" >>>>> >>>>> >>>>> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote: >>>>> >>>>>>>>> ... >>>>>>>> Can we do any sort of massive conversion at some logical >>>>>>>> timepoint. >>>>>>>> Probably after a branch release or something? Because it >>>>>>>> basically >>>>>>>> means we're going to have differences on nearly every line >>>>>>>> which is >>>>>>>> going to make diff-ing difficult when debugging old/new >>>>>>>> versions. >>>>>>>> Maybe it is not a problem because we aren't introducing and new >>>>>>>> bugs! >>>>>> >>>>>> Sorry, can you clarify the problem you envisage? And why would >>>>>> making a branch release help? >>>>> >>>>> Maybe the worry is that mass conversion in such a large codebase >>>>> could lead to hard-to-locate bugs. Shouldn't occur but who knows >>>>> w/o >>>>> trying? >>>>> >>>>>>> I agree; if we intend on doing this it should be all at once, >>>>>>> maybe on a branch dedicated to ensure that code changes don't >>>>>>> tank tests (they shouldn't but one never knows). We would then >>>>>>> need a script up- and-running that tidies everything up prior to >>>>>>> commits (though what happens if perltidy tanks?...). >>>>>>> Sendu, up for it? >>>>>> >>>>>> If its going to be difficult and a hassle, for such an >>>>>> unnecessary >>>>>> thing I'm not sure its worth it. There are more pressing >>>>>> things to >>>>>> be done for Bioperl. >>>>>> >>>>>> If I can just run perltidy on the entire package and commit, >>>>>> I'd do >>>>>> it. If that's not appropriate, I won't. >>>>> >>>>> The choices aren't necessarily all or nothing. What about >>>>> voluntary, >>>>> recommended use of a perltidy config file included with the >>>>> distribution, with additional 'caveats'? See my response to Sean. 
>>>>> >>>>>>>>> About svn >>>>>> [snip] >>>>>>> Stepped into that one, didn't I! I'll look into how much effort >>>>>>> is involved and try getting something going in the next >>>>>>> month or >>>>>>> two, maybe sooner if time permits. I'm lacking on SVN-foo as >>>>>>> well but it might be worth looking into. >>>>>> >>>>>> I'd put this in the unnecessary-but-nice category as well. If it >>>>>> will be as easy as my ->new change, go ahead. If not, there are >>>>>> more pressing matters (POD fixing, test script updating and >>>>>> finishing...). >>>>> >>>>> A few other open-bio projects have actively discussed a CVS->SVN >>>>> migration (BioRuby and I think BioPython, though the latter >>>>> could be >>>>> wrong). As I said, "it might be worth looking into" to weigh the >>>>> pros/cons, get others opinions >from others who have made the >>>>> transition, etc. We could, as Jason suggested, even set up a >>>>> tester >>>>> SVN w/o making it the default codebase (lock it off to a few >>>>> testers, >>>>> have CVS commits automatically/manually carry over to SVN, etc). >>>>> >>>>> I agree with you that it's not feasible to switch over prior to a >>>>> release and that there are more pressing issues, but it doesn't >>>>> hurt >>>>> having an open discussion about it. 
>>>>> >>>>> chris >>>>> >>>>> _______________________________________________ >>>>> Bioperl-l mailing list >>>>> Bioperl-l at lists.open-bio.org >>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> -- >> =========================================================== >> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >> =========================================================== >> >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > Jason Stajich > jason at bioperl.org > http://jason.open-bio.org/ > > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Sat Jun 16 10:55:09 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Sat, 16 Jun 2007 10:55:09 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <4673C7CB.1030709@mail.nih.gov> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> Message-ID: On Jun 16, 2007, at 7:21 AM, Sean Davis wrote: > As for access, the typical access is over http (or https). We're using svn+ssh here (NESCent) so the password is the same as the one you set for your account on the server, and you can use public/ private key negotiation for authentication. I think the ability to not provide a password for every single interaction is a requirement. 
Whether that requires svn+ssh or can also be made to work through https, I don't know. On sf.net I have to use https for svn and it doesn't ask me for the password each time. Not sure how this works though, maybe some local caching? We should not be using http, or any other protocol that sends unencrypted passwords. > Access controls can be set up on the server side while allowing > anonymous access for checkout. There are many excellent SVN for > every OS, so that should not be a problem. On Mac OSX the most convenient way I have found is through fink. It does ask to install 30 other dependencies, which made me balk at first, but doing it by hand is even worse than letting fink do it, so I finally gave in and it's really a breeze. I've not had a single issue. From a sysadmin perspective, what might be worth keeping in mind is that svn is going to store everything in a database (BerkeleyDB I think). I.e., there is no such thing anymore as restoring individual source code files from backup if one gets accidentally corrupted on the server. It seems you have to restore the entire database, i.e., the entire repository. I vaguely recall though that how svn manages the repository is actually configurable and that storage other than a DB is possible too. Don't ask me for the pros and cons of one vs the other. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From rvos at interchange.ubc.ca Sat Jun 16 13:09:18 2007 From: rvos at interchange.ubc.ca (rvos) Date: Sat, 16 Jun 2007 10:09:18 -0700 (PDT) Subject: [Bioperl-l] SVN and ...Re: Perltidy Message-ID: <10517356.1182013758160.JavaMail.myubc2@brahms.my.ubc.ca> CIPRES and Bio::Phylo use svn.
As for the benefits, a lot of sales talk has been expended over it already, for my own purpose I like the integration with eclipse (through subclipse plugin) and komodo, in addition to the atomic commits (so I can ctrl+c if I goof up (again)). For standalone use on osx I didn't use the fink one, but I forgot where I did get it from. It was very easy to set up, though. On windows there is a really nice standalone one (tortoisesvn) that integrates with the explorer so you can see on the file icons what the state of a file is. I know that there's a cvs2svn utility that converts your revision history (seems a requirement). Rutger -----Original Message----- > Date: Sat Jun 16 07:55:09 PDT 2007 > From: "Hilmar Lapp" > Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy > To: "Sean Davis" > > > On Jun 16, 2007, at 7:21 AM, Sean Davis wrote: > > > As for access, the typical access is over http (or https). > > We're using svn+ssh here (NESCent) so the password is the same as the > one you set for your account on the server, and you can use public/ > private key negotiation for authentication. > > I think the ability to not provide a password for every single > interaction is a requirement. If that requires using svn+ssh or can > be made to work through https too I don't know. On sf.net I have to > use https for svn and it doesn't ask me for the password each time. > Not sure how this works though, maybe some local caching? > > We should not be using http, or whatever other protocol that sends > unencrypted passwords. > > > Access controls can be set up on the server side while allowing > > anonymous access for checkout. There are many excellent SVN for > > every OS, so that should not be a problem. > > On Mac OSX the most convenient way I have found is through fink. It > does ask to install 30 other dependencies, which had me balk at > first, but me doing it by hand is even worse than fink doing it, so I > finally gave in and it's really a breeze. I've not had a single issue. 
> > From a sysadmin perspective, what might be worth keeping in mind is > that svn is going to store everything in a database (BerkeleyDB I > think). I.e., there is no such thing anymore as restoring individual > source code files from backup if one gets accidentally corrupted on > the server. It seems you have to restore the entire database, i.e., > the entire repository. I vaguely recall though that how svn manages > the repository is actually configurable and that other storage than > DB is possible too. Don't ask me for the pros and cons of one vs the > other. > > -hilmar > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From rvos at interchange.ubc.ca Sat Jun 16 13:15:45 2007 From: rvos at interchange.ubc.ca (rvos) Date: Sat, 16 Jun 2007 10:15:45 -0700 (PDT) Subject: [Bioperl-l] SVN and ...Re: Perltidy Message-ID: <27462805.1182014145280.JavaMail.myubc2@brahms.my.ubc.ca> A brief word on the topic of perltidy: no. I like what it does, and I sort of follow one of its settings (-syn -sob -b), but if you run it on a whole source tree it'll screw up the diffs, and I'm still worried about it breaking things (though really it shouldn't, it creates a *.bak if something doesn't compile anymore). Rutger -----Original Message----- > Date: Sat Jun 16 10:09:18 PDT 2007 > From: "rvos" > Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy > To: "Hilmar Lapp" , "Sean Davis" > > CIPRES and Bio::Phylo use svn. As for the benefits, a lot of sales talk has been expended over it already, for my own purpose I like the integration with eclipse (through subclipse plugin) and komodo, in addition to the atomic commits (so I can ctrl+c if I goof up (again)). 
> > For standalone use on osx I didn't use the fink one, but I forgot where I did get it from. It was very easy to set up, though. On windows there is a really nice standalone one (tortoisesvn) that integrates with the explorer so you can see on the file icons what the state of a file is. I know that there's a cvs2svn utility that converts your revision history (seems a requirement). > > Rutger > > > -----Original Message----- > > > Date: Sat Jun 16 07:55:09 PDT 2007 > > From: "Hilmar Lapp" > > Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy > > To: "Sean Davis" > > > > > > On Jun 16, 2007, at 7:21 AM, Sean Davis wrote: > > > > > As for access, the typical access is over http (or https). > > > > We're using svn+ssh here (NESCent) so the password is the same as the > > one you set for your account on the server, and you can use public/ > > private key negotiation for authentication. > > > > I think the ability to not provide a password for every single > > interaction is a requirement. If that requires using svn+ssh or can > > be made to work through https too I don't know. On sf.net I have to > > use https for svn and it doesn't ask me for the password each time. > > Not sure how this works though, maybe some local caching? > > > > We should not be using http, or whatever other protocol that sends > > unencrypted passwords. > > > > > Access controls can be set up on the server side while allowing > > > anonymous access for checkout. There are many excellent SVN for > > > every OS, so that should not be a problem. > > > > On Mac OSX the most convenient way I have found is through fink. It > > does ask to install 30 other dependencies, which had me balk at > > first, but me doing it by hand is even worse than fink doing it, so I > > finally gave in and it's really a breeze. I've not had a single issue. > > > > From a sysadmin perspective, what might be worth keeping in mind is > > that svn is going to store everything in a database (BerkeleyDB I > > think). 
I.e., there is no such thing anymore as restoring individual > > source code files from backup if one gets accidentally corrupted on > > the server. It seems you have to restore the entire database, i.e., > > the entire repository. I vaguely recall though that how svn manages > > the repository is actually configurable and that other storage than > > DB is possible too. Don't ask me for the pros and cons of one vs the > > other. > > > > -hilmar > > -- > > =========================================================== > > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > > =========================================================== > > > > > > > > > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From george.heller at yahoo.com Sat Jun 16 13:29:26 2007 From: george.heller at yahoo.com (George Heller) Date: Sat, 16 Jun 2007 10:29:26 -0700 (PDT) Subject: [Bioperl-l] Taxonomy hierarchy extraction Message-ID: <959624.48556.qm@web56502.mail.re3.yahoo.com> Hi all, I am looking at extracting the taxonomy hierarchy for some taxon ids. What I plan to do is, for a given taxon id, say 33090, I want to extract all taxon ids that are children of this species. I do not just want the immediate children, but the children's children and so on. Any ideas on the way I can go about doing this? George --------------------------------- Shape Yahoo! in your own image. Join our Network Research Panel today! 
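A minimal sketch of the kind of traversal asked about here (children, children's children, and so on), written as a recursive sub in plain Perl. It assumes the hierarchy is already available as a parent-to-children hash; the ids below are made up for illustration and are not real NCBI taxonomy relationships. With a real taxonomy database, the hash lookup would presumably be replaced by whatever call returns the direct children of a node.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical parent -> children map. These ids are invented for the
# example and do NOT reflect real taxonomy data.
my %children_of = (
    1 => [ 2, 3 ],
    2 => [ 4, 5 ],
    3 => [],
    4 => [ 6 ],
);

# Return every descendant of $id (children, grandchildren, ...),
# collected depth-first. Ids with no entry in the map are leaves.
sub all_descendants {
    my ( $id, $tree ) = @_;
    my @found;
    for my $child ( @{ $tree->{$id} || [] } ) {
        push @found, $child, all_descendants( $child, $tree );
    }
    return @found;
}

my @ids = all_descendants( 1, \%children_of );
print join( ',', @ids ), "\n";
```

The recursion stops on its own at leaves because a node without children contributes an empty list. For very deep or very wide trees an explicit stack or queue may be preferable to recursion.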
From bix at sendu.me.uk Sat Jun 16 14:21:38 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Sat, 16 Jun 2007 19:21:38 +0100 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <959624.48556.qm@web56502.mail.re3.yahoo.com> References: <959624.48556.qm@web56502.mail.re3.yahoo.com> Message-ID: <46742A32.90305@sendu.me.uk> George Heller wrote: > Hi all, > > I am looking at extracting the taxonomy hierarchy for some taxon ids. > What I plan to do is, for a given taxon id, say 33090, I want to > extract all taxon ids that are children of this species. I do not > just want the immediate children, but the children's children and so > on. > > Any ideas on the way I can go about doing this? Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in some kind of looping structure. Most easily a recursing sub. If you happen to code up something neat and efficient, why not share it with us and we could add it to the Taxonomy module(s). From cjfields at uiuc.edu Sat Jun 16 15:23:43 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 16 Jun 2007 14:23:43 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> Message-ID: On Jun 16, 2007, at 9:55 AM, Hilmar Lapp wrote: > > On Jun 16, 2007, at 7:21 AM, Sean Davis wrote: > >> As for access, the typical access is over http (or https). > > We're using svn+ssh here (NESCent) so the password is the same as the > one you set for your account on the server, and you can use public/ > private key negotiation for authentication. > > I think the ability to not provide a password for every single > interaction is a requirement. If that requires using svn+ssh or can > be made to work through https too I don't know. On sf.net I have to > use https for svn and it doesn't ask me for the password each time. 
> Not sure how this works though, maybe some local caching? > > We should not be using http, or whatever other protocol that sends > unencrypted passwords. Agreed; it should be through ssh. >> Access controls can be set up on the server side while allowing >> anonymous access for checkout. There are many excellent SVN for >> every OS, so that should not be a problem. > > On Mac OSX the most convenient way I have found is through fink. It > does ask to install 30 other dependencies, which had me balk at > first, but me doing it by hand is even worse than fink doing it, so I > finally gave in and it's really a breeze. I've not had a single issue. > > From a sysadmin perspective, what might be worth keeping in mind is > that svn is going to store everything in a database (BerkeleyDB I > think). I.e., there is no such thing anymore as restoring individual > source code files from backup if one gets accidentally corrupted on > the server. It seems you have to restore the entire database, i.e., > the entire repository. I vaguely recall though that how svn manages > the repository is actually configurable and that other storage than > DB is possible too. Don't ask me for the pros and cons of one vs the > other. MacPorts/DarwinPorts also has subversion, various language bindings, cvs2svn, and various perl modules. There are also a few SVN GUIs lingering around (including live folders within Komodo). chris From cjfields at uiuc.edu Sat Jun 16 15:18:06 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 16 Jun 2007 14:18:06 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <27462805.1182014145280.JavaMail.myubc2@brahms.my.ubc.ca> References: <27462805.1182014145280.JavaMail.myubc2@brahms.my.ubc.ca> Message-ID: <1A314D08-8F3C-4A4B-B58D-64AC7952F149@uiuc.edu> I think it's viable as an option if the code really needs it. After 100+ commits some of the code has schizy coding styles, so cleaning it up helps. 
In those cases having a perltidy config file present wouldn't hurt. However I agree that it shouldn't be applied across every module and should be done judiciously (the commit message, for instance, should actually state the code was tidied). chris PS - Nice to see the ball is rolling on SVN! On Jun 16, 2007, at 12:15 PM, rvos wrote: > A brief word on the topic of perltidy: no. I like what it does, and > I sort of follow one of its settings (-syn -sob -b), but if you run > it on a whole source tree it'll screw up the diffs, and I'm still > worried about it breaking things (though really it shouldn't, it > creates a *.bak if something doesn't compile anymore). > > Rutger > > > > -----Original Message----- > >> Date: Sat Jun 16 10:09:18 PDT 2007 >> From: "rvos" >> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy >> To: "Hilmar Lapp" , "Sean Davis" >> >> >> CIPRES and Bio::Phylo use svn. As for the benefits, a lot of sales >> talk has been expended over it already, for my own purpose I like >> the integration with eclipse (through subclipse plugin) and >> komodo, in addition to the atomic commits (so I can ctrl+c if I >> goof up (again)). >> >> For standalone use on osx I didn't use the fink one, but I forgot >> where I did get it from. It was very easy to set up, though. On >> windows there is a really nice standalone one (tortoisesvn) that >> integrates with the explorer so you can see on the file icons what >> the state of a file is. I know that there's a cvs2svn utility that >> converts your revision history (seems a requirement). >> >> Rutger >> >> >> -----Original Message----- >> >>> Date: Sat Jun 16 07:55:09 PDT 2007 >>> From: "Hilmar Lapp" >>> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy >>> To: "Sean Davis" >>> >>> >>> On Jun 16, 2007, at 7:21 AM, Sean Davis wrote: >>> >>>> As for access, the typical access is over http (or https). 
>>> >>> We're using svn+ssh here (NESCent) so the password is the same as >>> the >>> one you set for your account on the server, and you can use public/ >>> private key negotiation for authentication. >>> >>> I think the ability to not provide a password for every single >>> interaction is a requirement. If that requires using svn+ssh or can >>> be made to work through https too I don't know. On sf.net I have to >>> use https for svn and it doesn't ask me for the password each time. >>> Not sure how this works though, maybe some local caching? >>> >>> We should not be using http, or whatever other protocol that sends >>> unencrypted passwords. >>> >>>> Access controls can be set up on the server side while allowing >>>> anonymous access for checkout. There are many excellent SVN for >>>> every OS, so that should not be a problem. >>> >>> On Mac OSX the most convenient way I have found is through fink. It >>> does ask to install 30 other dependencies, which had me balk at >>> first, but me doing it by hand is even worse than fink doing it, >>> so I >>> finally gave in and it's really a breeze. I've not had a single >>> issue. >>> >>> From a sysadmin perspective, what might be worth keeping in >>> mind is >>> that svn is going to store everything in a database (BerkeleyDB I >>> think). I.e., there is no such thing anymore as restoring individual >>> source code files from backup if one gets accidentally corrupted on >>> the server. It seems you have to restore the entire database, i.e., >>> the entire repository. I vaguely recall though that how svn manages >>> the repository is actually configurable and that other storage than >>> DB is possible too. Don't ask me for the pros and cons of one vs the >>> other. 
>>> >>> -hilmar >>> -- >>> =========================================================== >>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >>> =========================================================== >>> >>> >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hartzell at alerce.com Sat Jun 16 13:47:01 2007 From: hartzell at alerce.com (George Hartzell) Date: Sat, 16 Jun 2007 10:47:01 -0700 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <18035.14352.963113.473274@almost.alerce.com> <6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu> <6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net> Message-ID: <18036.8725.29073.619527@almost.alerce.com> Chris Fields writes: > Ah, got it. Sorry. > > George, planning on taking this up? I'm going to take a *peek*. I just finished (unless someone finds another issue) moving someone's cvs repository over to svn, so I have some tools cobbled together and some knowledge in the cache. I don't have too much idle time at the moment though, so if it gets gooey I'll just summarize what I learn. Either way it seems worth a peek. I will need the repository itself though. I'll post a note to support at open-bio.org. g. 
From jason at bioperl.org Sat Jun 16 19:54:18 2007 From: jason at bioperl.org (Jason Stajich) Date: Sat, 16 Jun 2007 16:54:18 -0700 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <18036.8725.29073.619527@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <18035.14352.963113.473274@almost.alerce.com> <6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu> <6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net> <18036.8725.29073.619527@almost.alerce.com> Message-ID: <6F57475B-715F-49D1-B6D2-F3FD3ACCB728@bioperl.org> Thanks George. I'll respond to your support ticket as well but I put up tarballs of the repository as of today. I had thought at one point ChrisD might have setup rsync-able access to the whole repostitory through code.open-bio.org but for now I have put up tarballs of most of the CVS dirs from bioperl http://bioperl.org/uploads/ Just to say I already went through all the steps of running cvs2svn myself and had problems gathering back out the branches and all the tags when I tried it. If you want to start with a smaller repository like bioperl-network or bioperl-db as the initial cvs2svn conversion script took quite a long time to run on bioperl-live. Regarding ssh/https: We have already gone through some of this for blipkit and biojava projects. I think we'll still keep separate anonymous read-only (code.open-bio.org) and writeable repositories (dev.open-bio.org) as I think we are resisting any webapps on the developement server as we want that to as locked down as possible. For the newly created svn repositories that I've been creating/using I just use svn+ssh and that worked okay. -jason On Jun 16, 2007, at 10:47 AM, George Hartzell wrote: > Chris Fields writes: >> Ah, got it. Sorry. >> >> George, planning on taking this up? > > I'm going to take a *peek*. 
I just finished (unless someone finds > another issue) moving someone's cvs repository over to svn, so I have > some tools cobbled together and some knowledge in the cache. > > I don't have too much idle time at the moment though, so if it gets > gooey I'll just summarize what I learn. Either way it seems worth a > peek. > > I will need the repository itself though. I'll post a note to > support at open-bio.org. > > g. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From hartzell at alerce.com Sat Jun 16 19:56:09 2007 From: hartzell at alerce.com (George Hartzell) Date: Sat, 16 Jun 2007 16:56:09 -0700 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <46739D69.4090204@sheffield.ac.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <18035.14352.963113.473274@almost.alerce.com> <46739D69.4090204@sheffield.ac.uk> Message-ID: <18036.30873.609341.181853@almost.alerce.com> Nathan S. Haigh writes: > [...] > Sounds like George might know what he's doing! Hey, I've been looking for a Marketing Director. Want a job? > I have a question about > setting up svn access. I believe access can be done in several ways, > over webdav, over ssh and probably others too. Do you have any knowledge > about the benefits of one over the other? I suppose I'm thinking of what > to implement to allow anonymous read access for users and authenticated > access for developers. There are two and a half ways to talk to the repository: - You can put it behind a web server (e.g. apache) and get at it using http/https. Authentication and authorization happen using the normal web server tricks, so as long as you don't do anything silly (e.g. don't use basic auth, stick with mod_auth_digest), even http connections won't send passwords in the clear. 
You can define users in .htpasswd files or use any of the fancier setups (e.g. sql databases, etc...). - You can talk to it via subversion's simple server, svnserve. There are two ways you usually talk to svnserve (neither of which send passwords in the clear): * directly, using a URL like svn://svn.example.com/repo/proj/trunk when you do this the client either talks directly to a copy of svnserve running as a daemon, or possibly to something like inetd that'll start an svnserve as necessary. In this case, you define authen. and author. info in an svnserve.conf file. * indirectly, using a URL like svn+ssh://svn.example.com/repo/proj/trunk/ in which case you make an ssh connection to the server machine (and authenticate via ssh mechanisms, anything other than a key-pair will drive you nuts with repeated password requests) and then an svnserve process is started up for you in "tunnel mode". Access control is coarse grained and via OS level access permissions. Generally in this case you need to give out shell accounts to everyone involved, or (tsk, tsk) have them use a common account. There's a cute trick in the svn book that shows how to use a shared ssh account but still have all of the changes in the repo keep track of the real user. I've never tried it.... - If you're on the same machine as the repo, you can do this simply: file:///path/to/repo/proj/trunk The biggest deciding factor is how you want to manage your users and whether you're already messing around with a web server. I've generally worked in small groups and everyone's had ssh access, but I've set it up the other ways too. You can even access via multiple paths.
The only trick is that the repository needs to be writable by whoever's committing, and if they're running svnserve themselves (file: or svn+ssh:) and things aren't set up right (all the dirs in the repo need to be group writable and have the magic bit set so that any new stuff created is also writable, users' umasks and group membership need to be aligned) then things go fubar. Google's your friend here, and each of the OS's/distro's has a standard hack for making this work, usually involving a wrapper app that takes care of things. Feel free to ask any particular questions. Phew, g. From jason at bioperl.org Sat Jun 16 20:17:58 2007 From: jason at bioperl.org (Jason Stajich) Date: Sat, 16 Jun 2007 17:17:58 -0700 Subject: [Bioperl-l] seq doesn't validate error In-Reply-To: <200706151653.04135.sheris@eps.berkeley.edu> References: <200706151558.12911.sheris@eps.berkeley.edu> <1A4207F8295607498283FE9E93B775B4034C6E7B@EX02.asurite.ad.asu.edu> <200706151653.04135.sheris@eps.berkeley.edu> Message-ID: <6A369DE9-943A-4DF1-9DF0-F68E361C8C20@bioperl.org> The error is clearly saying there must be a symbol or letter in your sequence that violates the regexp. I had modified the code in CVS to actually provide a more informative mismatch error in the error message, but this is probably not in the release you are using. Anyways, add this to see what is causing the problem: print join(",",($nstarthash{$_}[1] =~ /([^ $Bio::PrimarySeq::MATCHPATTERN]+)/g)), "\n"; -jason On Jun 15, 2007, at 4:53 PM, Sheri Simmons wrote: > Thanks for the suggestion, but that still gives the same error as > before. > > On Friday 15 June 2007 4:11 pm, Kevin Brown wrote: >>> I'm getting an error as follows when I try to reverse >>> complement a sequence string stored in a hash of arrays. The >>> storage code is: >>> >>> $nstarthash{$key} = [$sortchecks[0], join("", >>> @nseq), >>> join("",@{$seqhash{$key}})]; >>> >>> the sequence of interest is the element at index 1.
>>> >>> Later, I try to retrieve this string for a subset of keys so >>> I can reverse complement it based on input from another hash >>> (%complement): >>> >>> my %revcomphash = map { my $read = $_; >>> grep $complement{$read} eq 'C', %complement; >>> {$_, (Bio::Seq->new(-seq >>> =>$nstarthash{$_}[1]))->revcom->seq()};} >>> keys(%nstarthash); >>> >>> >>> I get the following warning (long sequence edited for clarity): >>> >>> -- -------------------- WARNING --------------------- >>> MSG: seq doesn't validate, mismatch is 1 >>> --------------------------------------------------- >>> >>> ------------- EXCEPTION ------------- >>> MSG: Attempting to set the sequence to >>> [GCCCCTGTAATCGCTTTTATATCGTCAGCGATC] >>> which does not look healthy >>> STACK Bio::PrimarySeq::seq /usr/share/perl5/Bio/PrimarySeq.pm:268 >>> STACK Bio::PrimarySeq::new /usr/share/perl5/Bio/PrimarySeq.pm:217 >>> STACK Bio::Seq::new /usr/share/perl5/Bio/Seq.pm:498 STACK >>> toplevel ../quality_wrapper.pl:103 >>> >>> I cannot find any non-allowed characters in the sequence, and >>> the de-referencing appears to work correctly. Can anyone help me? >>> I'm using the latest Bioperl installation (1.5.2) with >>> ActivePerl5.8 on a Mepis 6.5 system. >> >> Try telling the Bio::Seq object what alphabet to use when creating >> it. >> I tend to create them like: >> >> Bio::Seq->new(-seq=> $seqvar, -alphabet=>'dna') > > -- > Sheri Simmons > Department of Earth and Planetary Sciences > University of California, Berkeley > Berkeley, CA 94720-4767 > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From n.haigh at sheffield.ac.uk Sun Jun 17 07:45:11 2007 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Sun, 17 Jun 2007 12:45:11 +0100 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <10517356.1182013758160.JavaMail.myubc2@brahms.my.ubc.ca> References: <10517356.1182013758160.JavaMail.myubc2@brahms.my.ubc.ca> Message-ID: <46751EC7.8020609@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 rvos wrote: > CIPRES and Bio::Phylo use svn. As for the benefits, a lot of sales talk has been expended over it already, for my own purpose I like the integration with eclipse (through subclipse plugin) and komodo, in addition to the atomic commits (so I can ctrl+c if I goof up (again)). > > For standalone use on osx I didn't use the fink one, but I forgot where I did get it from. It was very easy to set up, though. On windows there is a really nice standalone one (tortoisesvn) that integrates with the explorer so you can see on the file icons what the state of a file is. I know that there's a cvs2svn utility that converts your revision history (seems a requirement). > > Rutger > > Just to clarify, subversion is available as command line for windows: http://subversion.tigris.org/project_packages.html TortoiseSVN is another svn client with a GUI that integrates into the shell. I tried setting this up a while back to use ssh (via PUTTY), but I wasn't successful. This may have been due to me just starting out with svn or that it was harder to setup in an earlier version of TortoiseSVN. Does anyone have experience of setting up svn on Windows to use ssh? If the changeover takes place, I'm happy to write some howto's for setting up svn clients for Windows. 
Nath -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGdR7HczuW2jkwy2gRAmgOAJ96wLzVYbjqEPborZTsw6gwU6UitgCfV02v 8xHJvn/Eqf9LePR3Ei0ZaIw= =t5pN -----END PGP SIGNATURE----- From george.heller at yahoo.com Sun Jun 17 14:41:55 2007 From: george.heller at yahoo.com (George Heller) Date: Sun, 17 Jun 2007 11:41:55 -0700 (PDT) Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <46742A32.90305@sendu.me.uk> Message-ID: <148654.15952.qm@web56511.mail.re3.yahoo.com> Hi all, Can anyone point me to some example that uses the get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at this, and I am not quite sure how to implement it. Thanks. George Sendu Bala wrote: George Heller wrote: > Hi all, > > I am looking at extracting the taxonomy hierarchy for some taxon ids. > What I plan to do is, for a given taxon id, say 33090, I want to > extract all taxon ids that are children of this species. I do not > just want the immediate children, but the children's children and so > on. > > Any ideas on the way I can go about doing this? Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in some kind of looping structure. Most easily a recursing sub. If you happen to code up something neat and efficient, why not share it with us and we could add it to the Taxonomy module(s). --------------------------------- Shape Yahoo! in your own image. Join our Network Research Panel today! From jason at bioperl.org Sun Jun 17 16:48:05 2007 From: jason at bioperl.org (Jason Stajich) Date: Sun, 17 Jun 2007 13:48:05 -0700 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <148654.15952.qm@web56511.mail.re3.yahoo.com> References: <148654.15952.qm@web56511.mail.re3.yahoo.com> Message-ID: <617EAA37-D502-42AE-8820-C9514DBF7ACD@bioperl.org> I assume you already figured out how to setup a local taxonomydb? 
You just want the extant species/leaves of the tree my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents; -jason On Jun 17, 2007, at 11:41 AM, George Heller wrote: > Hi all, > > Can anyone point me to some example that uses the > get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at > this, and I am not quite sure how to implement it. > > Thanks. > George > > Sendu Bala wrote: > George Heller wrote: >> Hi all, >> >> I am looking at extracting the taxonomy hierarchy for some taxon ids. >> What I plan to do is, for a given taxon id, say 33090, I want to >> extract all taxon ids that are children of this species. I do not >> just want the immediate children, but the children's children and so >> on. >> >> Any ideas on the way I can go about doing this? > > Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in > some kind of looping structure. Most easily a recursing sub. > > If you happen to code up something neat and efficient, why not > share it > with us and we could add it to the Taxonomy module(s). > > > > --------------------------------- > Shape Yahoo! in your own image. Join our Network Research Panel > today! > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From aaron.j.mackey at gsk.com Sun Jun 17 22:35:42 2007 From: aaron.j.mackey at gsk.com (aaron.j.mackey at gsk.com) Date: Sun, 17 Jun 2007 22:35:42 -0400 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <46742A32.90305@sendu.me.uk> Message-ID: To do so efficiently, you might want to check out: http://www.oreillynet.com/pub/a/network/2002/11/27/bioconf.html -Aaron bioperl-l-bounces at lists.open-bio.org wrote on 06/16/2007 02:21:38 PM: > George Heller wrote: > > Hi all, > > > > I am looking at extracting the taxonomy hierarchy for some taxon ids.
> > What I plan to do is, for a given taxon id, say 33090, I want to > > extract all taxon ids that are children of this species. I do not > > just want the immediate children, but the children's children and so > > on. > > > > Any ideas on the way I can go about doing this? > > Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in > some kind of looping structure. Most easily a recursing sub. > > If you happen to code up something neat and efficient, why not share it > with us and we could add it to the Taxonomy module(s). > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From aaron.j.mackey at gsk.com Sun Jun 17 22:34:12 2007 From: aaron.j.mackey at gsk.com (aaron.j.mackey at gsk.com) Date: Sun, 17 Jun 2007 22:34:12 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: Message-ID: > On Jun 16, 2007, at 7:21 AM, Sean Davis wrote: > > > As for access, the typical access is over http (or https). > > We're using svn+ssh here (NESCent) Let me just note that https is preferable to ssh for those poor slobs stuck behind a corporate firewall (svn happily prompts me for my proxy server's user/pass, then my https authentication realm's user/pass - all then get cached in some .svn/ file that I don't have to worry about again until my proxy server password changes once a month ...) -Aaron From george.heller at yahoo.com Mon Jun 18 00:21:45 2007 From: george.heller at yahoo.com (George Heller) Date: Sun, 17 Jun 2007 21:21:45 -0700 (PDT) Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <617EAA37-D502-42AE-8820-C9514DBF7ACD@bioperl.org> Message-ID: <487845.37410.qm@web56510.mail.re3.yahoo.com> Thanks. And how can I assign the $node here in the below code, such that I can reference it to a particular taxon id record? I want to retrieve all the descendents from the taxonomy hierarchy, given a particular taxon id. 
I have a local db setup, in which I have uploaded data using the load_ncbi_taxonomy.pl script. Thanks. George Jason Stajich wrote: I assume you already figured out how to setup a local taxonomydb? You just want the extant species/leaves of the tree my @extant_children = grep { $_->is_Leaf } $node->get_all_Descedents; -jason On Jun 17, 2007, at 11:41 AM, George Heller wrote: Hi all, Can anyone point me to some example that uses the get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at this, and I am not quite sure how to implement it. Thanks. George Sendu Bala wrote: George Heller wrote: Hi all, I am looking at extracting the taxonomy hierarchy for some taxon ids. What I plan to do is, for a given taxon id, say 33090, I want to extract all taxon ids that are children of this species. I do not just want the immediate children, but the children's children and so on. Any ideas on the way I can go about doing this? Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in some kind of looping structure. Most easily a recursing sub. If you happen to code up something neat and efficient, why not share it with us and we could add it to the Taxonomy module(s). --------------------------------- Shape Yahoo! in your own image. Join our Network Research Panel today! _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ --------------------------------- Need a vacation? Get great deals to amazing places on Yahoo! Travel. From bix at sendu.me.uk Mon Jun 18 06:44:00 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 18 Jun 2007 11:44:00 +0100 Subject: [Bioperl-l] Network tests overhaul Message-ID: <467661F0.2060703@sendu.me.uk> When the test suite runs currently, most (the intent is all) tests skip if the test would require network (internet) access. 
This is to avoid tests failing not due to bugs in Bioperl code, but due to temporarily inaccessible servers. This is also to make running the test suite faster. To do a complete test you currently have to set BIOPERLDEBUG to true, which activates the network test but also increases verbosity. This actually causes a problem, since when running the entire test suite the additional debug information is more a hindrance than a help, since the reams of printed information can hide significant warnings that may also get printed. Its also ugly. The solution is to divorce activation of network tests from the request for verbosity. The obvious implementation is to have another environment variable, perhaps BIOPERLNETWORK. However, there is an opportunity to do something more appropriate. The running of networking tests should be a choice given to every end-user installing Bioperl. Debugging information, on the other hand, is only of interest to the developer working on a specific module under test, so can be left as a 'hidden' env var. I have just committed one possible implementation along these lines. You say: perl Build.PL as normal, and if you seem to have internet access it asks you if you'd like to run network tests. The default answer is no. If you answer yes, network tests will be enabled. You can alternatively say: perl Build.PL --network and if you seem to have internet access, network tests will be enabled. Then you run the tests: ./Build test Any tests written to support the new system will then skip network tests if they haven't been enabled. The only test I've written to support the new system is t/RemoteBlast.t: ./Build test --test_files t/RemoteBlast.t --verbose Adding support to test scripts consists of the following changes: + use Module::Build; + my $build = Module::Build->current(get_options => { network => {} }); + my $do_network_tests = $build->notes('network'); ! if (!$ENV{'BIOPERLDEBUG'}) { # skip network tests --- ! 
if (!$do_network_tests) { # skip network tests I propose adding this support to all test scripts that carry out network tests. Does anyone have objections? Does anyone have alternate implementations that may be superior? I specifically suggest we don't use an env var in addition to the above, because the multiple ways of doing things could lead to confusion. Which takes priority? Did a user really have the networking tests turned on when he reported his test results? The one thing I need help with is identifying which tests attempt to access the internet. I think we caught most of them for the 1.5.2 release, but I think there are more lurking around. Can anyone offer a way to systematically find at least the test scripts which access the internet, if not the specific tests within? Cheers, Sendu. From bix at sendu.me.uk Mon Jun 18 06:46:17 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 18 Jun 2007 11:46:17 +0100 Subject: [Bioperl-l] Network tests overhaul In-Reply-To: <467661F0.2060703@sendu.me.uk> References: <467661F0.2060703@sendu.me.uk> Message-ID: <46766279.7050202@sendu.me.uk> Sendu Bala wrote: > Adding support to test scripts consists of the following changes: > > + use Module::Build; > + my $build = Module::Build->current(get_options => { network => {} }); That should read: + my $build = Module::Build->current(); > + my $do_network_tests = $build->notes('network'); From cjfields at uiuc.edu Mon Jun 18 07:45:10 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 18 Jun 2007 06:45:10 -0500 Subject: [Bioperl-l] Network tests overhaul In-Reply-To: <46766279.7050202@sendu.me.uk> References: <467661F0.2060703@sendu.me.uk> <46766279.7050202@sendu.me.uk> Message-ID: The idea sounds good, though if we plan on doing this we need to update the Test HOWTO as well. Some modules require only a few (<50% of the total) network tests; I think SeqFeature.t may be one, though I'm not sure. Does this handle those cases? 
chris On Jun 18, 2007, at 5:46 AM, Sendu Bala wrote: > Sendu Bala wrote: >> Adding support to test scripts consists of the following changes: >> >> + use Module::Build; >> + my $build = Module::Build->current(get_options => { network => >> {} }); > > That should read: > + my $build = Module::Build->current(); > >> + my $do_network_tests = $build->notes('network'); > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Mon Jun 18 07:49:18 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 18 Jun 2007 12:49:18 +0100 Subject: [Bioperl-l] Network tests overhaul In-Reply-To: References: <467661F0.2060703@sendu.me.uk> <46766279.7050202@sendu.me.uk> Message-ID: <4676713E.1000508@sendu.me.uk> Chris Fields wrote: > The idea sounds good, though if we plan on doing this we need to update > the Test HOWTO as well. > > Some modules require only a few (<50% of the total) network tests; I > think SeqFeature.t may be one, though I'm not sure. Does this handle > those cases? Yes, the system just gives the test script a boolean describing if network tests should be run. The script can then do whatever it wants with the boolean. Skip all tests, skip no tests, skip just some tests... its a drop-in replacement for the current 'debug' boolean used based on BIOPERLDEBUG. 
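[Editor's sketch] The "skip just some tests" case described above could look roughly like the following. This is a minimal sketch, not the code committed to t/RemoteBlast.t: $do_network_tests is hard-coded here so the script runs on its own (in a real test script it would come from Module::Build->current->notes('network') as shown earlier in the thread), and fetch_remote_entry is a made-up stand-in for whatever call needs internet access.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Assumption: in a real test script this boolean would come from
# Module::Build->current->notes('network'). It is hard-coded here so
# that the sketch is self-contained.
my $do_network_tests = 0;

# Hypothetical stand-in for a call that would need internet access.
sub fetch_remote_entry { return 'remote data' }

# Test 1: purely local, so it always runs.
is(length('ACGT'), 4, 'local test always runs');

# Tests 2-3: need the network, so they are skipped as a group unless
# network tests were enabled.
SKIP: {
    skip('network tests have not been enabled', 2) unless $do_network_tests;
    my $entry = fetch_remote_entry();
    ok(defined $entry, 'got a remote entry');
    is($entry, 'remote data', 'remote entry has expected content');
}
```

Because the SKIP block is told how many tests it covers, the plan of 3 stays valid whether or not the network tests run, so a script with only a few network tests does not need to be skipped wholesale.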
From hlapp at gmx.net Mon Jun 18 08:38:25 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 18 Jun 2007 08:38:25 -0400 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <487845.37410.qm@web56510.mail.re3.yahoo.com> References: <487845.37410.qm@web56510.mail.re3.yahoo.com> Message-ID: <12C87737-B6D5-46E5-BCF4-5388A949009B@gmx.net> I'm a bit confused - it sounds like you have set up a local BioSQL database and loaded the NCBI taxonomy into the database. You can now use simple SQL to retrieve all descendants of a node in the tree given its NCBI taxonID such as SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, taxon n WHERE n.ncbi_taxon_id = :taxonID AND tn.left_value > n.left_value AND tn.right_value < n.right_value AND tn.taxon_id = tnm.taxon_id AND tnm.name_class = 'scientific name' BioPerl doesn't have a Taxonomy::biosql module yet (though this would seem like a worthwhile thing to add), so you can't use the Bio::DB::Taxonomy interface to do this against a BioSQL instance. However, BioPerl does have support for the flat-file download of the NCBI taxonomy database and indexes it, so you can simply use Taxonomy::{get_taxon,get_all_Descendents} using the flatfile download to achieve what you wanted to do in fewer than 5 lines of perl. Although the recursive implementation of Taxonomy::get_all_Descendents() won't be lightning fast, it may still be perfectly fine for your application - are you sure it is not? -hilmar On Jun 18, 2007, at 12:21 AM, George Heller wrote: > Thanks. And how can I assign the $node here in the below code, such > that I can reference it to a particular taxon id record? I want to > retrieve all the descendents from the taxonomy hierarchy, given a > particular taxon id. > > I have a local db setup, in which I have uploaded data using the > load_ncbi_taxonomy.pl script. > > Thanks. > George > > Jason Stajich wrote: > I assume you already figured out how to setup a local taxonomydb?
> > > You just want the extant species/leaves of the tree > > > my @extant_children = grep { $_->is_Leaf } $node->get_all_Descedents; > > > > -jason > On Jun 17, 2007, at 11:41 AM, George Heller wrote: > > Hi all, > > > Can anyone point me to some example that uses the > get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at > this, and I am not quite sure how to implement it. > > > Thanks. > George > > > Sendu Bala wrote: > George Heller wrote: > Hi all, > > > I am looking at extracting the taxonomy hierarchy for some taxon > ids. > What I plan to do is, for a given taxon id, say 33090, I want to > extract all taxon ids that are children of this species. I do not > just want the immediate children, but the children's children and so > on. > > > Any ideas on the way I can go about doing this? > > > Well, you'll use Bio::DB::Taxonomy presumably, and > each_Descendent in > some kind of looping structure. Most easily a recursing sub. > > > If you happen to code up something neat and efficient, why not > share it > with us and we could add it to the Taxonomy module(s). > > > > > > > --------------------------------- > Shape Yahoo! in your own image. Join our Network Research Panel > today! > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > -- > Jason Stajich > jason at bioperl.org > http://jason.open-bio.org/ > > > > > > > > --------------------------------- > Need a vacation? Get great deals to amazing places on Yahoo! Travel. 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Mon Jun 18 08:44:22 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 18 Jun 2007 08:44:22 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: Message-ID: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> Just curious - how do you cvs commit then to an external repository? Is that open in the firewall? It is true though that corporations typically will not permit any encrypted outgoing traffic through their firewall except https. sf.net only supports https for svn, AFAIK. -hilmar On Jun 17, 2007, at 10:34 PM, aaron.j.mackey at gsk.com wrote: >> On Jun 16, 2007, at 7:21 AM, Sean Davis wrote: >> >>> As for access, the typical access is over http (or https). >> >> We're using svn+ssh here (NESCent) > > Let me just note that https is preferable to ssh for those poor slobs > stuck behind a corporate firewall (svn happily prompts me for my proxy > server's user/pass, then my https authentication realm's user/pass > - all > then get cached in some .svn/ file that I don't have to worry about > again > until my proxy server password changes once a month ...) 
> > -Aaron > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Mon Jun 18 08:47:56 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 18 Jun 2007 08:47:56 -0400 Subject: [Bioperl-l] Network tests overhaul In-Reply-To: <467661F0.2060703@sendu.me.uk> References: <467661F0.2060703@sendu.me.uk> Message-ID: Sounds like a great idea to me. -hilmar On Jun 18, 2007, at 6:44 AM, Sendu Bala wrote: > When the test suite runs currently, most (the intent is all) tests > skip > if the test would require network (internet) access. This is to avoid > tests failing not due to bugs in Bioperl code, but due to temporarily > inaccessible servers. This is also to make running the test suite > faster. > > To do a complete test you currently have to set BIOPERLDEBUG to true, > which activates the network test but also increases verbosity. This > actually causes a problem, since when running the entire test suite > the > additional debug information is more a hindrance than a help, since > the > reams of printed information can hide significant warnings that may > also > get printed. Its also ugly. > > The solution is to divorce activation of network tests from the > request > for verbosity. The obvious implementation is to have another > environment > variable, perhaps BIOPERLNETWORK. However, there is an opportunity > to do > something more appropriate. The running of networking tests should > be a > choice given to every end-user installing Bioperl. Debugging > information, on the other hand, is only of interest to the developer > working on a specific module under test, so can be left as a 'hidden' > env var. 
> > > I have just committed one possible implementation along these lines. > > You say: > perl Build.PL > as normal, and if you seem to have internet access it asks you if > you'd > like to run network tests. The default answer is no. If you answer > yes, > network tests will be enabled. > > You can alternatively say: > perl Build.PL --network > and if you seem to have internet access, network tests will be > enabled. > > Then you run the tests: > ./Build test > Any tests written to support the new system will then skip network > tests > if they haven't been enabled. > > The only test I've written to support the new system is t/ > RemoteBlast.t: > ./Build test --test_files t/RemoteBlast.t --verbose > > > Adding support to test scripts consists of the following changes: > > + use Module::Build; > + my $build = Module::Build->current(get_options => { network => > {} }); > + my $do_network_tests = $build->notes('network'); > > ! if (!$ENV{'BIOPERLDEBUG'}) { # skip network tests > --- > ! if (!$do_network_tests) { # skip network tests > > > I propose adding this support to all test scripts that carry out > network > tests. Does anyone have objections? Does anyone have alternate > implementations that may be superior? > > I specifically suggest we don't use an env var in addition to the > above, > because the multiple ways of doing things could lead to confusion. > Which > takes priority? Did a user really have the networking tests turned on > when he reported his test results? > > > The one thing I need help with is identifying which tests attempt to > access the internet. I think we caught most of them for the 1.5.2 > release, but I think there are more lurking around. Can anyone offer a > way to systematically find at least the test scripts which access the > internet, if not the specific tests within? > > Cheers, > Sendu. 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Mon Jun 18 08:55:53 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 18 Jun 2007 07:55:53 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> References: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> Message-ID: <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu> On Jun 18, 2007, at 7:44 AM, Hilmar Lapp wrote: > Just curious - how do you cvs commit then to an external repository? > Is that open in the firewall? > > It is true though that corporations typically will not permit any > encrypted outgoing traffic through their firewall except https. > sf.net only supports https for svn, AFAIK. > > -hilmar If so it may be better to allow https, though I don't know how Chris D. and others feel about it. Did we make a decision as to the fate of cvs if we get svn up-and- running? Keep it around (assuming svn commits would be carried over to cvs and vice versa)? Or see what happens over time? chris From sdavis2 at mail.nih.gov Mon Jun 18 09:05:50 2007 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Mon, 18 Jun 2007 09:05:50 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: Message-ID: <4676832E.5080704@mail.nih.gov> aaron.j.mackey at gsk.com wrote: >> On Jun 16, 2007, at 7:21 AM, Sean Davis wrote: >> >>> As for access, the typical access is over http (or https). 
>> We're using svn+ssh here (NESCent) > > Let me just note that https is preferable to ssh for those poor slobs > stuck behind a corporate firewall (svn happily prompts me for my proxy > server's user/pass, then my https authentication realm's user/pass - all > then get cached in some .svn/ file that I don't have to worry about again > until my proxy server password changes once a month ...) That would be my suggestion as well (although I added it only parenthetically). Sean From hlapp at gmx.net Mon Jun 18 09:13:27 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 18 Jun 2007 09:13:27 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu> References: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu> Message-ID: <6759E154-CB02-4D76-8AB8-26C19FB952D2@gmx.net> On Jun 18, 2007, at 8:55 AM, Chris Fields wrote: > Did we make a decision as to the fate of cvs if we get svn up-and- > running? Keep it around (assuming svn commits would be carried > over to cvs and vice versa)? Or see what happens over time? Let's not plan for having cvs and svn writable repositories in parallel - that would create an administrative nightmare. Once the tests complete, there'll be a clean cut-over. What Jason suggested is to try and continue a read-only (anonymous) cvs repository, updated from the svn repository that the developers use, aside from an anonymous svn repository mirroring the writable one. This would primarily be for maintaining working URLs for those folks who http-linked into the anonymous cvs repository. What I added earlier is that even if that fails to be feasible, you can achieve the goal using some small CGI script and apache redirect to map CVS- style links to the anonymous svn repository. 
-hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Mon Jun 18 09:31:35 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 18 Jun 2007 08:31:35 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <6759E154-CB02-4D76-8AB8-26C19FB952D2@gmx.net> References: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu> <6759E154-CB02-4D76-8AB8-26C19FB952D2@gmx.net> Message-ID: <0E64DBD0-BBE9-411A-A146-70236EF558BB@uiuc.edu> On Jun 18, 2007, at 8:13 AM, Hilmar Lapp wrote: > > On Jun 18, 2007, at 8:55 AM, Chris Fields wrote: > >> Did we make a decision as to the fate of cvs if we get svn up-and- >> running? Keep it around (assuming svn commits would be carried >> over to cvs and vice versa)? Or see what happens over time? > > Let's not plan for having cvs and svn writable repositories in > parallel - that would create an administrative nightmare. Once the > tests complete, there'll be a clean cut-over. My thoughts as well. Much simpler. > What Jason suggested is to try and continue a read-only (anonymous) > cvs repository, updated from the svn repository that the developers > use, aside from an anonymous svn repository mirroring the writable > one. This would primarily be for maintaining working URLs for those > folks who http-linked into the anonymous cvs repository. What I > added earlier is that even if that fails to be feasible, you can > achieve the goal using some small CGI script and apache redirect to > map CVS-style links to the anonymous svn repository. > > -hilmar I like the idea of a read-only cvs or a 'faux' cvs, though the former would initially be easier as we already have it available. We could just lock it down at some switchover point to read-only (something I think Jason also suggested). 
chris From bix at sendu.me.uk Mon Jun 18 09:13:33 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 18 Jun 2007 14:13:33 +0100 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> <3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu> Message-ID: <467684FD.3080300@sendu.me.uk> Chris Fields wrote: > > On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote: >> If its going to be difficult and a hassle, for such an unnecessary >> thing I'm not sure its worth it. There are more pressing things to be >> done for Bioperl. >> >> If I can just run perltidy on the entire package and commit, I'd do >> it. If that's not appropriate, I won't. > > The choices aren't necessarily all or nothing. What about voluntary, > recommended use of a perltidy config file included with the > distribution, with additional 'caveats'? I'm happy with that idea. Why not come up with something and make it available for us to try out? Cheers, Sendu. From bix at sendu.me.uk Mon Jun 18 09:26:36 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 18 Jun 2007 14:26:36 +0100 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu> References: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu> Message-ID: <4676880C.9030009@sendu.me.uk> Chris Fields wrote: > If so it may be better to allow https, though I don't know how Chris > D. and others feel about it. If it makes no difference to me as an end-user, I won't mind. But I won't want to enter my password even once, at the beginning of a session. If that's not possible with https, then ssh should be an option as well. Unrelated, but it randomly just occurred to me: what happens to all the id lines at the top of modules? 
Eg: $Id: bl2seq.pm,v 1.28 2007/06/14 14:16:10 sendu Exp $ That's a cvs-specific thing, right? Do we delete them all? (Regardless, I wish we would, since they caused me no end of hassles during the 1.5.2 release, doing updates across branches.) > Did we make a decision as to the fate of cvs if we get svn up-and- > running? Keep it around (assuming svn commits would be carried over > to cvs and vice versa)? Or see what happens over time? Well, I don't think hard decisions are possible until we know how its going to work in practice. I tried setting up my own svn repository once, but didn't keep it and can't remember much about it. So, I suppose we'll play it by ear and decide things later. Is someone out there actively doing something leading toward a demonstration of how it will be? From cjfields at uiuc.edu Mon Jun 18 09:58:34 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 18 Jun 2007 08:58:34 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <467684FD.3080300@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> <3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu> <467684FD.3080300@sendu.me.uk> Message-ID: On Jun 18, 2007, at 8:13 AM, Sendu Bala wrote: > Chris Fields wrote: >> >> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote: >>> If its going to be difficult and a hassle, for such an unnecessary >>> thing I'm not sure its worth it. There are more pressing things >>> to be >>> done for Bioperl. >>> >>> If I can just run perltidy on the entire package and commit, I'd do >>> it. If that's not appropriate, I won't. >> >> The choices aren't necessarily all or nothing. What about voluntary, >> recommended use of a perltidy config file included with the >> distribution, with additional 'caveats'? > > I'm happy with that idea. Why not come up with something and make it > available for us to try out? > > > Cheers, > Sendu. 
Will do. Maybe something that conforms to PBP; there's a PBP perltidy config on perlmonks, along with some emacs/vim related bits: http://www.perlmonks.org/?node_id=516501 chris From sdavis2 at mail.nih.gov Mon Jun 18 10:03:35 2007 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Mon, 18 Jun 2007 10:03:35 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <4676880C.9030009@sendu.me.uk> References: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu> <4676880C.9030009@sendu.me.uk> Message-ID: <467690B7.7090105@mail.nih.gov> Sendu Bala wrote: > Chris Fields wrote: >> If so it may be better to allow https, though I don't know how Chris >> D. and others feel about it. > > If it makes no difference to me as an end-user, I won't mind. But I > won't want to enter my password even once, at the beginning of a > session. If that's not possible with https, then ssh should be an option > as well. > > > Unrelated, but it randomly just occurred to me: what happens to all the > id lines at the top of modules? Eg: > > $Id: bl2seq.pm,v 1.28 2007/06/14 14:16:10 sendu Exp $ > > That's a cvs-specific thing, right? Do we delete them all? (Regardless, > I wish we would, since they caused me no end of hassles during the 1.5.2 > release, doing updates across branches.) See here: http://svnbook.red-bean.com/en/1.0/ch07s02.html Check out the section at the bottom having to do with svn:keywords. 
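
On the $Id$ question: as far as I know, Subversion only expands keywords in files that carry the svn:keywords property, so the already-expanded CVS lines could be collapsed back to a bare $Id$ around the time of the switch. A rough sketch of that cleanup, assuming a POSIX shell and a sed that accepts -E. The name collapse_id_keyword is made up for illustration, and nothing here has been run against the BioPerl tree:

```shell
# collapse_id_keyword FILE
# Prints FILE with any expanded "$Id: ... $" keyword reduced to "$Id$".
# (Hypothetical helper; it writes to stdout and leaves FILE untouched.)
collapse_id_keyword() {
    sed -E 's/\$Id:[^$]*\$/$Id$/g' "$1"
}
```

For files where expansion is still wanted under svn, the property would be set per file with something like `svn propset svn:keywords Id FILE`.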
Sean

From akarger at CGR.Harvard.edu Mon Jun 18 10:10:57 2007
From: akarger at CGR.Harvard.edu (Amir Karger)
Date: Mon, 18 Jun 2007 10:10:57 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <46751EC7.8020609@sheffield.ac.uk>
References: <10517356.1182013758160.JavaMail.myubc2@brahms.my.ubc.ca>
 <46751EC7.8020609@sheffield.ac.uk>
Message-ID:

> Just to clarify, subversion is available as command line for windows:
> http://subversion.tigris.org/project_packages.html
>
> TortoiseSVN is another svn client with a GUI that integrates into the
> shell. I tried setting this up a while back to use ssh (via PUTTY), but
> I wasn't successful. This may have been due to me just starting out with
> svn or that it was harder to setup in an earlier version of TortoiseSVN.
>
> Does anyone have experience of setting up svn on Windows to use ssh? If
> the changeover takes place, I'm happy to write some howto's for setting
> up svn clients for Windows.

Here are some notes I wrote recently. I'm using this with command-line
svn, not TortoiseSVN. I would hope that it would work with Tortoise,
too, but I can't guarantee.

1. Run PuTTYgen (installed with PuTTY, probably in Start
   menu->Programs->PuTTY) and follow directions to create a private key
   file like C:\someplace\private_key.ppk and a public key. At this
   point, you'll pick an ssh password, which is separate from your login
   password.

2. Get an account with the appropriate .ssh/authorized_keys file on the
   host machine. (This is not Windows-specific. By the way, if you change
   the lines of the authorized_keys file to start with, e.g.,

     command="svnserve -t -r /main/repos/dir",no-pty ssh-rsa AAAAB... comment

   then (a) you're more secure because users can't open a real shell on
   the computer, and (b) users don't need to type the repository
   directory in their svn co commands.)

3. Set your environment variables (My Computer->Properties. Advanced Tab,
   click on Environment Variables. In the top half ("User variables for
   ..."), click "New" and put in the variable name and value).

3a. Set the SVN_EDITOR environment variable to your favorite editor, such
    as vim or emacs, or a full path to some other editor. If it's not
    set, then either VISUAL or EDITOR must be set.

3b. Set the SVN_SSH environment variable to run PuTTY's "plink" program,
    which is the Windows equivalent of command-line ssh. If you installed
    PuTTY in the default location, set it to
    "C:/Program Files/PuTTY/plink.exe".
    Note 1: use FORWARD slashes.
    Note 2: Include the quotation marks in the environment variable.

4. When you want to start using svn, you'll need to run Pageant (Start
   menu->Programs->PuTTY), select "Add Key", browse to your private key
   file, and enter the ssh password you chose in step 1 (not your login
   password). Pageant will stay running until you quit it or logout, so
   you can have multiple svn checkins etc., and you only need to type in
   your password once.

5. Now just run command-line svn commands the same way you would on UNIX
   (modulo Windows' brain-dead shell).

-Amir Karger

From cjfields at uiuc.edu Mon Jun 18 10:24:00 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 09:24:00 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <4676880C.9030009@sendu.me.uk>
References: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>
 <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
 <4676880C.9030009@sendu.me.uk>
Message-ID: <278C510F-8873-4F3D-A399-955979DD3DA5@uiuc.edu>

On Jun 18, 2007, at 8:26 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> If so it may be better to allow https, though I don't know how
>> Chris D. and others feel about it.
>
> If it makes no difference to me as an end-user, I won't mind. But I
> won't want to enter my password even once, at the beginning of a
> session. If that's not possible with https, then ssh should be an
> option as well.
Aaron pointed out in a related post that https access is the preferred option behind a corporate firewall (svn prompts for proxy user/pass, then caches it). Not sure how Jason/Hilmar/Chris D. feel about https or supporting both https+ssh. ... >> Did we make a decision as to the fate of cvs if we get svn up-and- >> running? Keep it around (assuming svn commits would be carried >> over to cvs and vice versa)? Or see what happens over time? > > Well, I don't think hard decisions are possible until we know how > its going to work in practice. I tried setting up my own svn > repository once, but didn't keep it and can't remember much about it. Agree; we'll need to work out specifics once we know how things work out using cvs2svn. I think the idea is to test using a smaller distribution (maybe network or db) and move up from there. > So, I suppose we'll play it by ear and decide things later. Is > someone out there actively doing something leading toward a > demonstration of how it will be? George Hartzell is going to test it out, I believe, and will post something when he can. chris From dmessina at wustl.edu Mon Jun 18 10:54:31 2007 From: dmessina at wustl.edu (David Messina) Date: Mon, 18 Jun 2007 09:54:31 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> <3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu> <467684FD.3080300@sendu.me.uk> Message-ID: <67E635BD-FC19-4046-949B-358B671299E6@wustl.edu> [Chris F] > Will do. 
Maybe something that conforms to PBP; there's a PBP > perltidy config on perlmonks, along with some emacs/vim related bits: > > http://www.perlmonks.org/?node_id=516501 FYI, perltidy now has a built-in -pbp flag: [from perltidy-20070508] > -pbp, --perl-best-practices > -pbp is an abbreviation for the parameters in the book Perl Best > Practices by Damian Conway: > > -l=78 -i=4 -ci=4 -st -se -vt=2 -cti=0 -pt=1 -bt=1 -sbt=1 -bbt=1 > -nsfs -nolq > -wbb="% + - * / x != == >= <= =~ !~ < > | & = > **= += *= &= <<= &&= -= /= |= >>= ||= //= .= %= ^= x=" > Note that the -st and -se flags make perltidy act as a filter on > one file only. These can be overridden with -nst and -nse if > necessary. > [full docs here: http://search.cpan.org/~shancock/Perl-Tidy-20070508/ bin/perltidy] Dave From dmessina at wustl.edu Mon Jun 18 11:04:10 2007 From: dmessina at wustl.edu (David Messina) Date: Mon, 18 Jun 2007 10:04:10 -0500 Subject: [Bioperl-l] Network tests overhaul In-Reply-To: <467661F0.2060703@sendu.me.uk> References: <467661F0.2060703@sendu.me.uk> Message-ID: Awesome, Sendu! Really glad you implemented this. > Can anyone offer a > way to systematically find at least the test scripts which access the > internet, if not the specific tests within? I think tests would be accessing the net indirectly through a BioPerl module (which may also be using indirect access), so it'd be hard to come up with a universal glob for that. However: % grep -Rl BIOPERLDEBUG bioperl-live/t | wc -l 108 % ls -1 bioperl-live/t | wc -l 248 Less than half of the test files use BIOPERLDEBUG, so that narrows down the possibilities... 
Dave From bix at sendu.me.uk Mon Jun 18 11:09:19 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 18 Jun 2007 16:09:19 +0100 Subject: [Bioperl-l] Network tests overhaul In-Reply-To: References: <467661F0.2060703@sendu.me.uk> Message-ID: <4676A01F.30205@sendu.me.uk> David Messina wrote: >> Can anyone offer a >> way to systematically find at least the test scripts which access the >> internet, if not the specific tests within? > > I think tests would be accessing the net indirectly through a BioPerl > module (which may also be using indirect access), so it'd be hard to > come up with a universal glob for that. > > However: > > % grep -Rl BIOPERLDEBUG bioperl-live/t | wc -l > 108 > > % ls -1 bioperl-live/t | wc -l > 248 > > Less than half of the test files use BIOPERLDEBUG, so that narrows down > the possibilities... Not necessarily. The problem is that there may be test scripts that have never even tried to skip network tests, and therefore don't use BIOPERLDEBUG. (Or that chose their own way to decide when to skip.) I was thinking along the lines of, does anyone know how to monitor accesses to the network card (or equivalent), getting information on which program (test script) requested the access? From cjfields at uiuc.edu Mon Jun 18 11:41:28 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 18 Jun 2007 10:41:28 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <67E635BD-FC19-4046-949B-358B671299E6@wustl.edu> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> <3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu> <467684FD.3080300@sendu.me.uk> <67E635BD-FC19-4046-949B-358B671299E6@wustl.edu> Message-ID: On Jun 18, 2007, at 9:54 AM, David Messina wrote: > [Chris F] >> Will do. 
Maybe something that conforms to PBP; there's a PBP >> perltidy config on perlmonks, along with some emacs/vim related bits: >> >> http://www.perlmonks.org/?node_id=516501 > > > FYI, perltidy now has a built-in -pbp flag: > > [from perltidy-20070508] >> -pbp, --perl-best-practices >> -pbp is an abbreviation for the parameters in the book Perl Best >> Practices by Damian Conway: >> >> -l=78 -i=4 -ci=4 -st -se -vt=2 -cti=0 -pt=1 -bt=1 -sbt=1 -bbt=1 >> -nsfs -nolq >> -wbb="% + - * / x != == >= <= =~ !~ < > | & = >> **= += *= &= <<= &&= -= /= |= >>= ||= //= .= %= ^= x=" >> Note that the -st and -se flags make perltidy act as a filter on >> one file only. These can be overridden with -nst and -nse if >> necessary. >> > [full docs here: http://search.cpan.org/~shancock/Perl-Tidy-20070508/ > bin/perltidy] > > > Dave Makes sense that would eventually be incorporated. If so there's no need to include a config (unless we want to sway away from PBP-style). We can just recommend everyone use that setting. chris From cjfields at uiuc.edu Mon Jun 18 12:06:26 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 18 Jun 2007 11:06:26 -0500 Subject: [Bioperl-l] Network tests overhaul In-Reply-To: <4676A01F.30205@sendu.me.uk> References: <467661F0.2060703@sendu.me.uk> <4676A01F.30205@sendu.me.uk> Message-ID: <082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu> On Jun 18, 2007, at 10:09 AM, Sendu Bala wrote: > David Messina wrote: >>> ... >> Less than half of the test files use BIOPERLDEBUG, so that narrows >> down >> the possibilities... > > Not necessarily. The problem is that there may be test scripts that > have > never even tried to skip network tests, and therefore don't use > BIOPERLDEBUG. (Or that chose their own way to decide when to skip.) > > I was thinking along the lines of, does anyone know how to monitor > accesses to the network card (or equivalent), getting information on > which program (test script) requested the access? 
EUtilities.t uses network tests predominately. I'll switch over when I commit everything from the overhaul. Couldn't you enable BIOPERLDEBUG, disable network access, then iterate through tests checking for those which fail or skip? I think Test::Harness has a way to do this, using execute_tests(). chris From bix at sendu.me.uk Mon Jun 18 12:34:38 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 18 Jun 2007 17:34:38 +0100 Subject: [Bioperl-l] Network tests overhaul In-Reply-To: <082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu> References: <467661F0.2060703@sendu.me.uk> <4676A01F.30205@sendu.me.uk> <082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu> Message-ID: <4676B41E.3050706@sendu.me.uk> Chris Fields wrote: > Couldn't you enable BIOPERLDEBUG, disable network access, then iterate > through tests checking for those which fail or skip? Yes, good idea, though my dev machine is also my email/webserver so I'd rather come up with an alternate solution than one involving 'disable network access'. Still, that's what I'll probably end up doing. Cheers! Oh, Chris, Spiros, how goes the Test::More conversion? I might want to wait for you to finish, or join in? If you're not going to have time to do any more in the next few weeks, can you please update http://www.bioperl.org/wiki/TestMoreProgress removing your name (or in the opposite case, add your name in)? Its not quite clear to me which tests are assigned to whom. Can someone clarify what the markings mean? Cheers, Sendu. 
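
The approach Chris suggests above (turn on BIOPERLDEBUG, take the network away, and see which scripts fall over) might be sketched like this. It is only an illustration: report_failing_tests is a made-up name, running each script with plain `perl -Ilib` is an assumption about the checkout layout, and the function does not disable the network itself; that still has to be done by hand before calling it.

```shell
# report_failing_tests DIR [RUNNER...]
# Runs every *.t file in DIR with BIOPERLDEBUG=1 set and prints the path
# of each script that exits non-zero. RUNNER defaults to "perl -Ilib",
# which is an assumption, not something taken from the BioPerl build.
report_failing_tests() {
    dir=$1
    shift
    [ "$#" -gt 0 ] || set -- perl -Ilib
    for t in "$dir"/*.t; do
        [ -e "$t" ] || continue          # no *.t files: nothing to do
        if ! BIOPERLDEBUG=1 "$@" "$t" >/dev/null 2>&1; then
            printf '%s\n' "$t"
        fi
    done
}
```

From the top of a checkout this would be called as `report_failing_tests t`. Scripts that quietly skip their network tests still exit 0, so they will not show up in the list; catching those would mean reading the test output rather than the exit status.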
From cjfields at uiuc.edu Mon Jun 18 12:43:31 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 18 Jun 2007 11:43:31 -0500 Subject: [Bioperl-l] Network tests overhaul In-Reply-To: <4676B41E.3050706@sendu.me.uk> References: <467661F0.2060703@sendu.me.uk> <4676A01F.30205@sendu.me.uk> <082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu> <4676B41E.3050706@sendu.me.uk> Message-ID: <4D4061A2-E937-4F97-85C3-8937B597C64F@uiuc.edu> On Jun 18, 2007, at 11:34 AM, Sendu Bala wrote: > Chris Fields wrote: >> Couldn't you enable BIOPERLDEBUG, disable network access, then >> iterate through tests checking for those which fail or skip? > > Yes, good idea, though my dev machine is also my email/webserver so > I'd rather come up with an alternate solution than one involving > 'disable network access'. > > Still, that's what I'll probably end up doing. Cheers! > > > Oh, Chris, Spiros, how goes the Test::More conversion? I might want > to wait for you to finish, or join in? If you're not going to have > time to do any more in the next few weeks, can you please update > http://www.bioperl.org/wiki/TestMoreProgress removing your name (or > in the opposite case, add your name in)? Its not quite clear to me > which tests are assigned to whom. Can someone clarify what the > markings mean? > > Cheers, > Sendu. Not sure how far along spiros is; I handed it over after I finished up to the 'Q' tests. In general the ones marked out have been converted over, ones with names next to them have been claimed. If you need help I'll prob. start back up again to finish them off; we just need to divy them up. chris From george.heller at yahoo.com Mon Jun 18 13:07:59 2007 From: george.heller at yahoo.com (George Heller) Date: Mon, 18 Jun 2007 10:07:59 -0700 (PDT) Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <12C87737-B6D5-46E5-BCF4-5388A949009B@gmx.net> Message-ID: <218165.62089.qm@web56505.mail.re3.yahoo.com> What exactly is the "node n" in the query below. 
When I issue this query, it says, relation "node" does not exist. I tried to use the get_all_Descendents method but it looks like in order to do a recursive call it calls the method each_Descendent. This method is not implemented in Bio::DB::Taxonomy. It just has a single line, shift->throw_not_implemented(); Thanks. George. Hilmar Lapp wrote: I'm a bit confused - it sounds like you have set up a local BioSQL database and loaded the NCBI taxonomy into the database. You can now use simple SQL to retrieve all descendants of a node in the tree given its NCBI taxonID such as SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n WHERE n.ncbi_taxon_id = :taxonID AND tn.left_value > n. left_value AND tn.right_value < n.right_value AND tn.taxon_id = tnm.taxon_id AND tn.name_class = 'scientific_name' BioPerl doesn't have a Taxonomy::biosql module yet (though this would seem like a worthwhile thing to add), so you can't use the Bio::DB::Taxonomy interface to do this against a BioSQL instance. However, BioPerl does have support for the flat-file download of the NCBI taxonomy database and indexes it, so you can simply use Taxonomy::{get_taxon,get_all_Descendants} using the flatfile download to achieve what you wanted to do in a less than 5 lines of perl. Although the recursive implementation of Taxonomy::get_all_Descendants () won't be lightning fast, it may still be perfectly fine for your application - are you sure it is not? -hilmar On Jun 18, 2007, at 12:21 AM, George Heller wrote: > Thanks. And how can I assign the $node here in the below code, such > that I can reference it to a particular taxon id record? I want to > retrieve all the descendents from the taxonomy hierarchy, given a > particular taxon id. > > I have a local db setup, in which I have uploaded data using the > load_ncbi_taxonomy.pl script. > > Thanks. > George > > Jason Stajich wrote: > I assume you already figured out how to setup a local taxonomydb? 
>
> You just want the extant species/leaves of the tree
>
> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
> -jason
> On Jun 17, 2007, at 11:41 AM, George Heller wrote:
>
> Hi all,
>
> Can anyone point me to some example that uses the
> get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at
> this, and I am not quite sure how to implement it.
>
> Thanks.
> George
>
> Sendu Bala wrote:
> George Heller wrote:
> Hi all,
>
> I am looking at extracting the taxonomy hierarchy for some taxon ids.
> What I plan to do is, for a given taxon id, say 33090, I want to
> extract all taxon ids that are children of this species. I do not
> just want the immediate children, but the children's children and so
> on.
>
> Any ideas on the way I can go about doing this?
>
> Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
> some kind of looping structure. Most easily a recursing sub.
>
> If you happen to code up something neat and efficient, why not share it
> with us and we could add it to the Taxonomy module(s).
>
> --
> Jason Stajich
> jason at bioperl.org
> http://jason.open-bio.org/
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From jason at bioperl.org Mon Jun 18 13:53:28 2007
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 18 Jun 2007 10:53:28 -0700
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <218165.62089.qm@web56505.mail.re3.yahoo.com>
References: <218165.62089.qm@web56505.mail.re3.yahoo.com>
Message-ID: <6F7C6223-2DD7-46B4-A637-EC6AFFC6BDBC@bioperl.org>

It is implemented in the implementing class - DB::Taxonomy is just the
base class. For example, see the flatfile implementation,
Bio::DB::Taxonomy::flatfile.

See scripts/taxa/local_taxonomydb_query.PLS for an example using it; the
nodes and names files are from the NCBI taxonomy database.

Here is an un-debugged copy+paste for your question that *should* work.

    use strict;
    use warnings;
    use Bio::DB::Taxonomy;

    # directory where the flatfile index gets written
    my $idx_dir = '/tmp';
    # nodes.dmp and names.dmp come from the NCBI taxonomy dump
    my ($nodesfile, $namesfile) = ('nodes.dmp', 'names.dmp');
    my $db = Bio::DB::Taxonomy->new(-source    => 'flatfile',
                                    -nodesfile => $nodesfile,
                                    -namesfile => $namesfile,
                                    -directory => $idx_dir);
    # look up the starting taxon, then keep only its leaf descendants
    my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
    my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

-jason

On Jun 18, 2007, at 10:07 AM, George Heller wrote:

> What exactly is the "node n" in the query below. When I issue this
> query, it says,
>
> relation "node" does not exist.
>
> I tried to use the get_all_Descendents method but it looks like
> in order to do a recursive call it calls the method
> each_Descendent. This method is not implemented in
> Bio::DB::Taxonomy. It just has a single line,
>
> shift->throw_not_implemented();
>
> Thanks.
> George.
--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/

From hlapp at gmx.net Mon Jun 18 18:10:00 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 18:10:00 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <278C510F-8873-4F3D-A399-955979DD3DA5@uiuc.edu>
References: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu> <4676880C.9030009@sendu.me.uk> <278C510F-8873-4F3D-A399-955979DD3DA5@uiuc.edu>
Message-ID: <989DBD68-896E-4FB9-9413-4A1060E88ABD@gmx.net>

https is working fine for me for sf.net repositories, and I only have to enter the password upon first commit (since checkout doesn't even need a password).

-hilmar

On Jun 18, 2007, at 10:24 AM, Chris Fields wrote:

> Not sure how Jason/Hilmar/Chris D. feel about https or supporting
> both https+ssh

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From george.heller at yahoo.com Mon Jun 18 18:18:21 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 15:18:21 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <6F7C6223-2DD7-46B4-A637-EC6AFFC6BDBC@bioperl.org>
Message-ID: <904670.24974.qm@web56513.mail.re3.yahoo.com>

I tried running the below mentioned script and I seem to be getting the following error:

Weak references are not implemented in the version of perl at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76.
Compilation failed in require at my.pl line 7.
BEGIN failed--compilation aborted at my.pl line 7.

My script looks something like,

#!/usr/bin/perl
use strict;
#use warnings;
use DBI;
use Bio::Tree::Node;
use Bio::DB::Taxonomy;
use Bio::DB::Taxonomy::flatfile;
my $idx_dir = '/tmp';

my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
                               -nodesfile => $nodesfile,
                               -namesfile => $namesfile,
                               -directory => $idx_dir);
my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

foreach $field (@extant_children) {
    print "$field";
    print "|";
    print "\n";
}

And I am running the script using the command,

perl myscript.pl -v --names names.dmp --nodes nodes.dmp

and I have the nodes.dmp and names.dmp files in the current directory.

Thanks,
George

From hlapp at gmx.net Mon Jun 18 18:27:19 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 18:27:19 -0400
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <218165.62089.qm@web56505.mail.re3.yahoo.com>
References: <218165.62089.qm@web56505.mail.re3.yahoo.com>
Message-ID:

On Jun 18, 2007, at 1:07 PM, George Heller wrote:

> What exactly is the "node n" in the query below. When I issue this
> query, it says,

Sorry, replace "node" with "taxon". Jason answered the rest.
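The left_value/right_value comparison in the query under discussion relies on the nested-set idea: each node gets a left number on the way down a depth-first walk and a right number on the way back up, so every descendant's pair lies strictly inside its ancestor's pair. A minimal sketch of that idea in plain Perl follows, under the assumption of a tiny made-up tree. The ids and the helpers number_tree and descendants_of are illustrative only and are not part of BioSQL or BioPerl.

```perl
use strict;
use warnings;

# Made-up tree: parent id => list of child ids.
my %children = (
    1  => [10, 60],
    10 => [20, 30],
    20 => [40, 50],
);

# Assign nested-set numbers by depth-first traversal: a node gets its left
# value when it is first visited and its right value after all its children.
my (%left, %right);
my $counter = 0;

sub number_tree {
    my ($id) = @_;
    $left{$id} = ++$counter;
    number_tree($_) for @{ $children{$id} || [] };
    $right{$id} = ++$counter;
    return;
}
number_tree(1);

# Same test as the WHERE clause: left and right strictly inside the ancestor's.
sub descendants_of {
    my ($ancestor) = @_;
    return sort { $a <=> $b }
           grep { $left{$_} > $left{$ancestor} && $right{$_} < $right{$ancestor} }
           keys %left;
}

my @under_10 = descendants_of(10);
print "descendants of 10: @under_10\n";
```

The point of the design is that descendants at every depth come back from one range comparison, with no recursion at query time; the cost is paid once, when the numbers are assigned.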
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From cjfields at uiuc.edu Mon Jun 18 18:33:40 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 17:33:40 -0500
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <904670.24974.qm@web56513.mail.re3.yahoo.com>
References: <904670.24974.qm@web56513.mail.re3.yahoo.com>
Message-ID: <707D8821-68CB-40AE-A190-AD0F8F2C3915@uiuc.edu>

As the error implies, your local version of perl doesn't seem to support weak references, which means it doesn't have Scalar::Util (which was added to core after perl 5.6.1, I think). Try installing Scalar::Util to see what happens.

chris

On Jun 18, 2007, at 5:18 PM, George Heller wrote:

> I tried running the below mentioned script and I seem to be getting
> the following error:
>
> Weak references are not implemented in the version of perl at
> /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
> BEGIN failed--compilation aborted at
> /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76.
> Compilation failed in require at my.pl line 7.
> BEGIN failed--compilation aborted at my.pl line 7.
Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From hlapp at gmx.net Mon Jun 18 18:50:38 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 18:50:38 -0400
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <707D8821-68CB-40AE-A190-AD0F8F2C3915@uiuc.edu>
References: <904670.24974.qm@web56513.mail.re3.yahoo.com> <707D8821-68CB-40AE-A190-AD0F8F2C3915@uiuc.edu>
Message-ID:

The perl version appears to be 5.8.5 though, so something strange appears to be going on too. George, can you please post the output of

$ /usr/bin/perl -V

-hilmar

On Jun 18, 2007, at 6:33 PM, Chris Fields wrote:

> As the error implies, your local version of perl doesn't seem to support
> weak references, which means it doesn't have Scalar::Util (which was
> added to core after perl 5.6.1, I think). Try installing
> Scalar::Util to see what happens.
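Since the perl here reports itself as 5.8.5, a direct check of whether weaken actually works can narrow things down. The following is a minimal diagnostic sketch, not something posted in the thread. One possible cause, offered as an assumption rather than a diagnosis, is a Scalar::Util installed without its compiled part, in which case weaken can be missing even on a perl that otherwise supports weak references.

```perl
use strict;
use warnings;

# Load Scalar::Util without importing anything, so that a build lacking
# weaken() does not abort at import time.
require Scalar::Util;

# Show which copy of the module perl actually picked up from @INC.
print "Scalar::Util loaded from $INC{'Scalar/Util.pm'}\n";

# Try to weaken a reference; eval catches the failure if weaken is missing.
my $works = eval {
    my $target = {};
    my $copy   = $target;
    Scalar::Util::weaken($copy);
    Scalar::Util::isweak($copy) ? 1 : 0;
};

if ($works) {
    print "weak references are available\n";
}
else {
    print "weak references are NOT available: ", ($@ || "isweak returned false\n");
}
```

If the check fails, the printed path shows which copy of Scalar::Util would need to be reinstalled or replaced.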
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From george.heller at yahoo.com Mon Jun 18 19:05:42 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 16:05:42 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To:
Message-ID: <706979.34648.qm@web56509.mail.re3.yahoo.com>

This is the output of /usr/bin/perl -V

Summary of my perl5 (revision 5 version 8 subversion 5) configuration:
  Platform:
    osname=linux, osvers=2.6.9-22.18.bz155725.elsmp, archname=i386-linux-thread-multi
    uname='linux hs20-bc1-4.build.redhat.com 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686 i686 i386 gnulinux '
    config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost -Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc.
-Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0' hint=recommended, useposix=true, d_sigaction=define usethreads=define use5005threads=undef useithreads=define usemultiplicity=define useperlio=define d_sfio=undef uselargefiles=define usesocks=undef use64bitint=undef use64bitall=undef uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm', optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4', cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm' ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-2)', gccosandvers='' intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12 ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8 alignbytes=4, prototype=define Linker and Libraries: ld='gcc', ldflags =' -L/usr/local/lib' libpth=/usr/local/lib /lib /usr/lib libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc libc=/lib/libc-2.3.4.so, so=so, useshrplib=true, libperl=libperl.so gnulibc_version='2.3.4' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE' cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib' Characteristics of this binary (from libperl): Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS USE_LARGE_FILES 
                        PERL_IMPLICIT_CONTEXT
  Built under linux
  Compiled at Jul 24 2006 18:28:10
  @INC:
    /usr/lib/perl5/5.8.5/i386-linux-thread-multi
    /usr/lib/perl5/5.8.5
    /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.5
    /usr/lib/perl5/site_perl/5.8.4
    /usr/lib/perl5/site_perl/5.8.3
    /usr/lib/perl5/site_perl/5.8.2
    /usr/lib/perl5/site_perl/5.8.1
    /usr/lib/perl5/site_perl/5.8.0
    /usr/lib/perl5/site_perl
    /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.5
    /usr/lib/perl5/vendor_perl/5.8.4
    /usr/lib/perl5/vendor_perl/5.8.3
    /usr/lib/perl5/vendor_perl/5.8.2
    /usr/lib/perl5/vendor_perl/5.8.1
    /usr/lib/perl5/vendor_perl/5.8.0
    /usr/lib/perl5/vendor_perl

Thanks.
George
.

Hilmar Lapp wrote:
The perl version appears to be 5.8.5 though, so something strange
appears to be going on too.

George, can you please post the output of

$ /usr/bin/perl -V

-hilmar

On Jun 18, 2007, at 6:33 PM, Chris Fields wrote:

> As the error implies your local version of perl doesn't seem support
> weak references, which means it doesn't have Scalar::Utils (which was
> added to core after perl 5.6.1, I think). Try installing
> Scalar::Utils to see what happens.
>
> chris
>
> [...]

From jason at bioperl.org Mon Jun 18 19:22:08 2007
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 18 Jun 2007 16:22:08 -0700
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <706979.34648.qm@web56509.mail.re3.yahoo.com>
References: <706979.34648.qm@web56509.mail.re3.yahoo.com>
Message-ID:

Try installing the latest Scalar::Util

On Jun 18, 2007, at 4:05 PM, George Heller wrote:

> This is the output of /usr/bin/perl -V
>
> [...]

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/

From george.heller at yahoo.com Mon Jun 18 20:04:00 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 17:04:00 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To:
Message-ID: <424035.72876.qm@web56507.mail.re3.yahoo.com>

Ok, I installed the latest of Scalar::Util and the script seems to be working. But I am confused where exactly I need to look for the descendent taxon ids once the script is run. I did look into the /tmp/ directory, but I couldnt understand much.

Sorry to be bothering, really appreaciate your patience.

Thanks.
George

Jason Stajich wrote:
Try installing the latest Scalar::Util

[...]

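The descendant walk discussed in this thread (each_Descendent inside "some kind of looping structure", and the worry that a recursive get_all_Descendents might be slow) can be illustrated without BioPerl at all. The following is only a minimal sketch, not how Bio::DB::Taxonomy::flatfile is implemented. It assumes the NCBI nodes.dmp layout in which the first two "|"-separated columns are tax_id and parent tax_id; the rows, their IDs, and the helper names build_children_map, all_descendants and leaf_descendants are invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toy rows in the assumed nodes.dmp layout: tax_id | parent tax_id | rank | ...
# These IDs are made up for the example; they are not real NCBI taxon IDs.
my @rows = (
    "1\t|\t1\t|\tno rank\t|",
    "10\t|\t1\t|\tkingdom\t|",
    "20\t|\t10\t|\tgenus\t|",
    "21\t|\t20\t|\tspecies\t|",
    "22\t|\t20\t|\tspecies\t|",
    "30\t|\t10\t|\tspecies\t|",
    "40\t|\t1\t|\tspecies\t|",
);

# Build a parent => [children] map from the first two columns of each row.
sub build_children_map {
    my @lines = @_;
    my %children;
    for my $line (@lines) {
        my ($id, $parent) = split /\t\|\t?/, $line;
        next if $id == $parent;    # the root row lists itself as its parent
        push @{ $children{$parent} }, $id;
    }
    return \%children;
}

# Collect every descendant of $root using an explicit stack rather than
# a recursing sub, so deep trees do not grow the call stack.
sub all_descendants {
    my ($children, $root) = @_;
    my @found;
    my @stack = @{ $children->{$root} || [] };
    while (@stack) {
        my $id = pop @stack;
        push @found, $id;
        push @stack, @{ $children->{$id} || [] };
    }
    my @sorted = sort { $a <=> $b } @found;
    return @sorted;
}

# Leaves are the descendants that never appear as a parent.
sub leaf_descendants {
    my ($children, $root) = @_;
    return grep { !exists $children->{$_} } all_descendants($children, $root);
}

my $children = build_children_map(@rows);
print join(",", all_descendants($children, 10)), "\n";     # prints 20,21,22,30
print join(",", leaf_descendants($children, 10)), "\n";    # prints 21,22,30
```

The leaf filter plays the same role as the grep { $_->is_Leaf } in the scripts quoted in this thread; with real data the rows would be read from nodes.dmp instead of a hard-coded list.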
From jason at bioperl.org Mon Jun 18 20:17:34 2007
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 18 Jun 2007 17:17:34 -0700
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <424035.72876.qm@web56507.mail.re3.yahoo.com>
References: <424035.72876.qm@web56507.mail.re3.yahoo.com>
Message-ID: <1FE460CB-F001-4EAE-9E83-FDF52AFFA5D0@bioperl.org>

All the children are in this array.

You get to decide what you want to do with them. In the following example I print the id, rank, and scientific name to the screen. Because this is a taxonomy db query you are getting back Bio::Taxonomy::Taxon objects, so read the documentation for this module to see what you can do with the object.

I would also suggest spending a little time with the Getting started and HOWTO:Trees documentation on the website to get familiar with the objects and nomenclature.

my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
for my $child ( @extant_children ) {
    print "id is ", $child->id, "\n"; # NCBI taxa id
    print "rank is ", $child->rank, "\n"; # e.g. species
    print "scientific name is ", $child->scientific_name, "\n"; # scientific name
}

On Jun 18, 2007, at 5:04 PM, George Heller wrote:

> Ok, I installed the latest of Scalar::Util and the script seems to be working. But I am confused where exactly I need to look for the descendent taxon ids once the script is run. I did look into the /tmp/ directory, but I couldn't understand much.
>
> Sorry to be bothering, really appreciate your patience.
>
> Thanks.
> George > > Jason Stajich wrote: > Try installing the latest Scalar::Util > On Jun 18, 2007, at 4:05 PM, George Heller wrote: > > This is the output of /usr/bin/perl -V > > > Summary of my perl5 (revision 5 version 8 subversion 5) > configuration: > Platform: > osname=linux, osvers=2.6.9-22.18.bz155725.elsmp, > archname=i386-linux-thread-multi > uname='linux hs20-bc1-4.build.redhat.com > 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686 > i686 i386 gnulinux ' > config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 - > mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost - > Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. - > Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux - > Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads - > Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db - > Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio - > Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/ > less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0' > hint=recommended, useposix=true, d_sigaction=define > usethreads=define use5005threads=undef useithreads=define > usemultiplicity=define > useperlio=define d_sfio=undef uselargefiles=define > usesocks=undef > use64bitint=undef use64bitall=undef uselongdouble=undef > usemymalloc=n, bincompat5005=undef > Compiler: > cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING - > fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE - > D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm', > optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4', > cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict- > aliasing -pipe -I/usr/local/include -I/usr/include/gdbm' > ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-2)', > gccosandvers='' > intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234 > d_longlong=define, longlongsize=8, d_longdbl=define, > longdblsize=12 > ivtype='long', ivsize=4, 
nvtype='double', nvsize=8, > Off_t='off_t', lseeksize=8 > alignbytes=4, prototype=define > Linker and Libraries: > ld='gcc', ldflags =' -L/usr/local/lib' > libpth=/usr/local/lib /lib /usr/lib > libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil - > lpthread -lc > perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc > libc=/lib/libc-2.3.4.so, so=so, useshrplib=true, > libperl=libperl.so > gnulibc_version='2.3.4' > Dynamic Linking: > dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='- > Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE' > cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib' > > > Characteristics of this binary (from libperl): > Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS > USE_LARGE_FILES PERL_IMPLICIT_CONTEXT > Built under linux > Compiled at Jul 24 2006 18:28:10 > @INC: > /usr/lib/perl5/5.8.5/i386-linux-thread-multi > /usr/lib/perl5/5.8.5 > /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi > /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi > /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi > /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi > /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi > /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi > /usr/lib/perl5/site_perl/5.8.5 > /usr/lib/perl5/site_perl/5.8.4 > /usr/lib/perl5/site_perl/5.8.3 > /usr/lib/perl5/site_perl/5.8.2 > /usr/lib/perl5/site_perl/5.8.1 > /usr/lib/perl5/site_perl/5.8.0 > /usr/lib/perl5/site_perl > /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi > /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi > /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi > /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi > /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi > /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi > /usr/lib/perl5/vendor_perl/5.8.5 > /usr/lib/perl5/vendor_perl/5.8.4 > /usr/lib/perl5/vendor_perl/5.8.3 > /usr/lib/perl5/vendor_perl/5.8.2 > 
>     /usr/lib/perl5/vendor_perl/5.8.1
>     /usr/lib/perl5/vendor_perl/5.8.0
>     /usr/lib/perl5/vendor_perl
>
> Thanks.
> George
>
> Hilmar Lapp wrote:
> The perl version appears to be 5.8.5 though, so something strange appears to be going on too.
>
> George, can you please post the output of
>
> $ /usr/bin/perl -V
>
> -hilmar
>
> On Jun 18, 2007, at 6:33 PM, Chris Fields wrote:
>
> As the error implies, your local version of perl doesn't seem to support weak references, which means it doesn't have Scalar::Util (which was added to core after perl 5.6.1, I think). Try installing Scalar::Util to see what happens.
>
> chris
>
> On Jun 18, 2007, at 5:18 PM, George Heller wrote:
>
> I tried running the below mentioned script and I seem to be getting the following error:
>
> Weak references are not implemented in the version of perl at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
> BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76.
> Compilation failed in require at my.pl line 7.
> BEGIN failed--compilation aborted at my.pl line 7.
>
> My script looks something like,
>
> #!/usr/bin/perl
> use strict;
> #use warnings;
> use DBI;
> use Bio::Tree::Node;
> use Bio::DB::Taxonomy;
> use Bio::DB::Taxonomy::flatfile;
> my $idx_dir = '/tmp';
>
> my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
> my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
>                                -nodesfile => $nodesfile,
>                                -namesfile => $namesfile,
>                                -directory => $idx_dir);
> my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
> foreach $field (@extant_children) {
>     print "$field";
>     print "|";
>     print "\n";
> }
>
> And I am running the script using the command,
>
> perl myscript.pl -v --names names.dmp --nodes nodes.dmp
>
> and I have the nodes.dmp and names.dmp files in the current directory.
>
> Thanks,
> George
>
> Jason Stajich wrote:
> It is implemented in the implementing class - DB::Taxonomy is just the base class. For example see the flatfile implementation Bio::DB::Taxonomy::flatfile
>
> See the scripts/taxa/local_taxonomydb_query.PLS for an example using it: nodes and names are from the NCBI taxonomy database.
>
> Here is an un-debugged copy+paste for your question that *should* work.
>
> use Bio::DB::Taxonomy;
> my $idx_dir = '/tmp';
>
> my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
> my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
>                                -nodesfile => $nodefile,
>                                -namesfile => $namesfile,
>                                -directory => $idx_dir);
> my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
> -jason
>
> On Jun 18, 2007, at 10:07 AM, George Heller wrote:
>
> What exactly is the "node n" in the query below? When I issue this query, it says,
>
> relation "node" does not exist.
>
> I tried to use the get_all_Descendents method, but it looks like in order to do a recursive call it calls the method each_Descendent. This method is not implemented in Bio::DB::Taxonomy. It just has a single line,
>
> shift->throw_not_implemented();
>
> Thanks.
> George.
>
> Hilmar Lapp wrote:
> I'm a bit confused - it sounds like you have set up a local BioSQL database and loaded the NCBI taxonomy into the database. You can now use simple SQL to retrieve all descendants of a node in the tree given its NCBI taxonID such as
>
> SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n
> WHERE
> n.ncbi_taxon_id = :taxonID
> AND tn.left_value > n.left_value
> AND tn.right_value < n.right_value
> AND tn.taxon_id = tnm.taxon_id
> AND tn.name_class = 'scientific_name'
>
> BioPerl doesn't have a Taxonomy::biosql module yet (though this would seem like a worthwhile thing to add), so you can't use the Bio::DB::Taxonomy interface to do this against a BioSQL instance.
>
> However, BioPerl does have support for the flat-file download of the NCBI taxonomy database and indexes it, so you can simply use Taxonomy::{get_taxon,get_all_Descendants} using the flatfile download to achieve what you wanted to do in less than 5 lines of perl.
>
> Although the recursive implementation of Taxonomy::get_all_Descendants() won't be lightning fast, it may still be perfectly fine for your application - are you sure it is not?
>
> -hilmar
>
> On Jun 18, 2007, at 12:21 AM, George Heller wrote:
>
> Thanks. And how can I assign the $node in the code below, so that it refers to a particular taxon id record? I want to retrieve all the descendents from the taxonomy hierarchy, given a particular taxon id.
>
> I have a local db setup, into which I have uploaded data using the load_ncbi_taxonomy.pl script.
>
> Thanks.
> George
>
> Jason Stajich wrote:
> I assume you already figured out how to set up a local taxonomydb?
>
> You just want the extant species/leaves of the tree
>
> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
> -jason
>
> On Jun 17, 2007, at 11:41 AM, George Heller wrote:
>
> Hi all,
>
> Can anyone point me to some example that uses the get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at this, and I am not quite sure how to implement it.
>
> Thanks.
> George
>
> Sendu Bala wrote:
> George Heller wrote:
> Hi all,
>
> I am looking at extracting the taxonomy hierarchy for some taxon ids. What I plan to do is, for a given taxon id, say 33090, I want to extract all taxon ids that are children of this species. I do not just want the immediate children, but the children's children and so on.
>
> Any ideas on the way I can go about doing this?
>
> Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in some kind of looping structure. Most easily a recursing sub.
>
> If you happen to code up something neat and efficient, why not share it with us and we could add it to the Taxonomy module(s).

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/

From george.heller at yahoo.com Mon Jun 18 20:29:31 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 17:29:31 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <1FE460CB-F001-4EAE-9E83-FDF52AFFA5D0@bioperl.org>
Message-ID: <369098.81077.qm@web56507.mail.re3.yahoo.com>

But the problem is that I don't really get any output on the screen.
In the /tmp directory I get 4 files, namely parents, nodes, id2names and names2id, but I don't know what to make of them. This is what my script looks like,

#!/usr/bin/perl
use strict;
#use warnings;
use DBI;
use Bio::Tree::Node;
use Bio::DB::Taxonomy;
use Bio::DB::Taxonomy::flatfile;
my $idx_dir = '/tmp';
my $nodefile;
my $namesfile;

my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
                               -nodesfile => $nodefile,
                               -namesfile => $namesfile,
                               -directory => $idx_dir);
my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

for my $child ( @extant_children ) {
    print "id is ", $child->id, "\n"; # NCBI taxa id
    print "rank is ", $child->rank, "\n"; # e.g. species
    print "scientific name is ", $child->scientific_name, "\n"; # scientific name
}

Thanks.
George

Jason Stajich wrote:

All the children are in this array.

You get to decide what you want to do with them. In the following example I print the id, rank, and scientific name to the screen. Because this is a taxonomy db query you are getting back Bio::Taxonomy::Taxon objects, so read the documentation for this module to see what you can do with the object.

I would also suggest spending a little time with the Getting started and HOWTO:Trees documentation on the website to get familiar with the objects and nomenclature.

my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
for my $child ( @extant_children ) {
    print "id is ", $child->id, "\n"; # NCBI taxa id
    print "rank is ", $child->rank, "\n"; # e.g. species
    print "scientific name is ", $child->scientific_name, "\n"; # scientific name
}

On Jun 18, 2007, at 5:04 PM, George Heller wrote:

Ok, I installed the latest of Scalar::Util and the script seems to be working. But I am confused where exactly I need to look for the descendent taxon ids once the script is run. I did look into the /tmp/ directory, but I couldn't understand much.
Sorry to be bothering, really appreciate your patience.

Thanks.
George
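Editor's sketch, not part of the original thread: the files that appear in /tmp in the message above are derived from the NCBI dump files nodes.dmp and names.dmp. Assuming the usual nodes.dmp layout (fields separated by tab, pipe, tab; each line ending in tab, pipe; tax id, parent tax id and rank as the first three fields), a parent/child lookup can be built in a few lines of plain Perl. The sample lines are made up, and the hashes below are illustrative only; they are not the on-disk format BioPerl uses for its index files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Three made-up lines in the assumed nodes.dmp layout. Only the first
# three fields (tax id, parent tax id, rank) are used; real lines carry more.
my @lines = (
    "33090\t|\t1\t|\tkingdom\t|",
    "100\t|\t33090\t|\tphylum\t|",
    "101\t|\t100\t|\tspecies\t|",
);

my ( %parent_of, %rank_of, %children_of );
for my $line (@lines) {
    my $copy = $line;
    $copy =~ s/\t\|$//;    # drop the trailing field terminator
    my ( $id, $parent, $rank ) = split /\t\|\t/, $copy;
    $parent_of{$id} = $parent;
    $rank_of{$id}   = $rank;
    push @{ $children_of{$parent} }, $id;    # reverse index: parent => children
}

print "parent of 101: $parent_of{101}\n";
print "rank of 101: $rank_of{101}\n";
print "children of 33090: @{ $children_of{33090} }\n";
```

Building the reverse index once is what makes later descendant lookups cheap, which is the same reason an on-disk index only has to be created on the first run.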
From jason at bioperl.org Mon Jun 18 21:05:43 2007
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 18 Jun 2007 18:05:43 -0700
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <369098.81077.qm@web56507.mail.re3.yahoo.com>
References: <369098.81077.qm@web56507.mail.re3.yahoo.com>
Message-ID:

The files are indexes because you are indexing a flatfile - this speeds up the lookup, so the second time you run the script it doesn't have to index. You don't need to look at the files, they won't make sense to a human!

The reason it isn't printing anything is that someone didn't really write the implementation quite right. This code was overhauled by Sendu before the last release; I guess something didn't quite get connected.

I checked in code that has the Bio::Taxon delegating now to a DB handle for the each_Descendent call. You can either patch your code or just use the code listed here:
http://bioperl.org/wiki/Module:Bio::DB::Taxonomy

On Jun 18, 2007, at 5:29 PM, George Heller wrote:

> But the problem is that I don't really get any output on the screen. In the /tmp directory I get 4 files, namely parents, nodes, id2names and names2id, but I don't know what to make of them. This is what my script looks like,
>
> #!/usr/bin/perl
> use strict;
> #use warnings;
> use DBI;
> use Bio::Tree::Node;
> use Bio::DB::Taxonomy;
> use Bio::DB::Taxonomy::flatfile;
> my $idx_dir = '/tmp';
> my $nodefile;
> my $namesfile;
>
> my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
> my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
>                                -nodesfile => $nodefile,
>                                -namesfile => $namesfile,
>                                -directory => $idx_dir);
> my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
> for my $child ( @extant_children ) {
>     print "id is ", $child->id, "\n"; # NCBI taxa id
>     print "rank is ", $child->rank, "\n"; # e.g. species
>     print "scientific name is ", $child->scientific_name, "\n"; # scientific name
> }
>
> Thanks.
> George
>
> Jason Stajich wrote:
> All the children are in this array.
>
> You get to decide what you want to do with them. In the following example I print the id, rank, and scientific name to the screen. Because this is a taxonomy db query you are getting back Bio::Taxonomy::Taxon objects, so read the documentation for this module to see what you can do with the object.
> I would also suggest spending a little time with the Getting started and HOWTO:Trees documentation on the website to get familiar with the objects and nomenclature.
>
> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
> for my $child ( @extant_children ) {
>     print "id is ", $child->id, "\n"; # NCBI taxa id
>     print "rank is ", $child->rank, "\n"; # e.g. species
>     print "scientific name is ", $child->scientific_name, "\n"; # scientific name
> }
>
> On Jun 18, 2007, at 5:04 PM, George Heller wrote:
>
> Ok, I installed the latest of Scalar::Util and the script seems to be working. But I am confused where exactly I need to look for the descendent taxon ids once the script is run. I did look into the /tmp/ directory, but I couldn't understand much.
>
> Sorry to be bothering, really appreciate your patience.
>
> Thanks.
> George
>
> On Jun 18, 2007, at 5:18 PM, George Heller wrote:
>
> I tried running the below mentioned script and I seem to be getting the following error:
>
> Weak references are not implemented in the version of perl at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
> BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76.
> Compilation failed in require at my.pl line 7.
> BEGIN failed--compilation aborted at my.pl line 7.
> > > > > My script looks something like, > > > > > #!/usr/bin/perl > use strict; > #use warnings; > use DBI; > use Bio::Tree::Node; > use Bio::DB::Taxonomy; > use Bio::DB::Taxonomy::flatfile; > my $idx_dir = '/tmp'; > > > > > my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp'); > my $db = new Bio::DB::Taxonomy(-source => 'flatfile', > -nodesfile => $nodesfile, > -namesfile => $namesfile, > -directory => $idx_dir); > my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); > my @extant_children = grep { $_->is_Leaf } $node- > get_all_Descendents; > > > > > foreach $field (@extant_children) { > print "$field"; > print "|"; > print "\n"; > } > > > > > And I am running the script using the command, > > > > > perl myscript.pl -v --names names.dmp --nodes nodes.dmp > > > > > and I have the nodes.dmp and names.dmp files in the current > directory. > > > > > Thanks, > George > > > > > > > > > Jason Stajich wrote: > It is implemented in the implementing class - DB::Taxonomy is > just the base class. For example see the flatfile implementation > Bio::DB::Taxonomy::flatfile > > > > > See the scripts/taxa/local_taxonomydb_query.PLS for example using > it: > nodes and names are from NCBI taxonomy database. > > > > > > > > > Here is an un-debugged copy+paste for your question that *should* > work. > > > > > > > > > use Bio::DB::Taxonomy > my $idx_dir = '/tmp'; > > > > > > > > > my ($nodefile,$namesfile) = ('nodes.dmp,'names.dmp'); > my $db = new Bio::DB::Taxonomy(-source => 'flatfile', > -nodesfile => $nodesfile, > -namesfile => $namesfile, > -directory => $idx_dir); > my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); > my @extant_children = grep { $_->is_Leaf } $node- > get_all_Descendents; > > > > > > > > > > > > > > > > > -jason > > > > > On Jun 18, 2007, at 10:07 AM, George Heller wrote: > > > > > What exactly is the "node n" in the query below. When I issue > this query, it says, > > > > > > > > > relation "node" does not exist. 
> > > > > > > > > I tried to use the get_all_Descendents method but it looks like > in order to do a recursive call it calls the method > each_Descendent. This method is not implemented in > Bio::DB::Taxonomy. It just has a single line, > > > > > > > > > shift->throw_not_implemented(); > > > > > > > > > Thanks. > George. > > > > > > > > > Hilmar Lapp wrote: > I'm a bit confused - it sounds like you have set up a local > BioSQL > database and loaded the NCBI taxonomy into the database. You can > now > use simple SQL to retrieve all descendants of a node in the tree > given its NCBI taxonID such as > > > > > > > > > SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n > WHERE > n.ncbi_taxon_id = :taxonID > AND tn.left_value > n. left_value > AND tn.right_value < n.right_value > AND tn.taxon_id = tnm.taxon_id > AND tn.name_class = 'scientific_name' > > > > > > > > > BioPerl doesn't have a Taxonomy::biosql module yet (though this > would > seem like a worthwhile thing to add), so you can't use the > Bio::DB::Taxonomy interface to do this against a BioSQL instance. > > > > > > > > > However, BioPerl does have support for the flat-file download of > the > NCBI taxonomy database and indexes it, so you can simply use > Taxonomy::{get_taxon,get_all_Descendants} using the flatfile > download > to achieve what you wanted to do in a less than 5 lines of perl. > > > > > > > > > Although the recursive implementation of > Taxonomy::get_all_Descendants > () won't be lightning fast, it may still be perfectly fine for > your > application - are you sure it is not? > > > > > > > > > -hilmar > > > > > > > > > On Jun 18, 2007, at 12:21 AM, George Heller wrote: > > > > > > > > > Thanks. And how can I assign the $node here in the below code, > such > that I can reference it to a particular taxon id record? I want to > retrieve all the descendents from the taxonomy hierarchy, given a > particular taxon id. 
> > > > > > > > > I have a local db setup, in which I have uploaded data using the > load_ncbi_taxonomy.pl script. > > > > > > > > > Thanks. > George > > > > > > > > > Jason Stajich wrote: > I assume you already figured out how to setup a local taxonomydb? > > > > > > > > > > > > > > > > > You just want the extant species/leaves of the tree > > > > > > > > > > > > > > > > > my @extant_children = grep { $_->is_Leaf } $node- > get_all_Descedents; > > > > > > > > > > > > > > > > > > > > > > > > > -jason > On Jun 17, 2007, at 11:41 AM, George Heller wrote: > > > > > > > > > Hi all, > > > > > > > > > > > > > > > > > Can anyone point me to some example that uses the > get_all_Descendents method from Bio::DB::Taxonomy? I am a > newbie at > this, and I am not quite sure how to implement it. > > > > > > > > > > > > > > > > > Thanks. > George > > > > > > > > > > > > > > > > > Sendu Bala wrote: > George Heller wrote: > Hi all, > > > > > > > > > > > > > > > > > I am looking at extracting the taxonomy hierarchy for some taxon > ids. > What I plan to do is, for a given taxon id, say 33090, I want to > extract all taxon ids that are children of this species. I do not > just want the immediate children, but the children's children > and so > on. > > > > > > > > > > > > > > > > > Any ideas on the way I can go about doing this? > > > > > > > > > > > > > > > > > Well, you'll use Bio::DB::Taxonomy presumably, and > each_Descendent in > some kind of looping structure. Most easily a recursing sub. > > > > > > > > > > > > > > > > > If you happen to code up something neat and efficient, why not > share it > with us and we could add it to the Taxonomy module(s). > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > --------------------------------- > Shape Yahoo! in your own image. Join our Network Research Panel > today! 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > > > > > > > > > > > > > > -- > Jason Stajich > jason at bioperl.org > http://jason.open-bio.org/ > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > --------------------------------- > Need a vacation? Get great deals to amazing places on Yahoo! > Travel. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > > > > > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > --------------------------------- > Take the Internet to Go: Yahoo!Go puts the Internet in your > pocket: mail, news, photos & more. > > > > > > > > > -- > Jason Stajich > jason at bioperl.org > http://jason.open-bio.org/ > > > > > > > > > > > > > > > > > > > > > > > > > > > > > --------------------------------- > Bored stiff? Loosen up... > Download and play hundreds of games for free on Yahoo! Games. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. 
Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > -- > Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net > > -- > Jason Stajich > jason at bioperl.org > http://jason.open-bio.org/

-- Jason Stajich jason at bioperl.org http://jason.open-bio.org/

From torsten.seemann at infotech.monash.edu.au Mon Jun 18 21:21:04 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Tue, 19 Jun 2007 11:21:04 +1000
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <4676A01F.30205@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk> <4676A01F.30205@sendu.me.uk>
Message-ID:

Sendu,

> >> Can anyone offer a
> >> way to systematically find at least the test scripts which access the
> >> internet, if not the specific tests within?

Perhaps you could use 'strace' to list network system calls for each test script, and grep out AF_INET connections?
% strace -e trace=network command_to_test 2>&1 | grep AF_INET

I'm not an strace expert but it might do what you need.

--
--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University
--Tel +61 3 9905 9010

From george.heller at yahoo.com Mon Jun 18 21:16:10 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 18:16:10 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To:
Message-ID: <815364.33231.qm@web56512.mail.re3.yahoo.com>

Works perfectly. Thanks so much Jason, Hilmar, Chris. You've been a great help!

Thanks.
George

Jason Stajich wrote:
The files are indexes because you are indexing a flatfile; this speeds up the lookup, so the second time you run the script it doesn't have to index. You don't need to look at the files, they won't make sense to a human!

The reason it isn't printing anything is someone didn't really write the implementation quite right. This code was overhauled by Sendu before the last release; I guess something didn't quite get connected.

I checked in code that has the Bio::Taxon delegating now to a DB handle for the each_Descendent call. You can either patch your code or just use the code listed here: http://bioperl.org/wiki/Module:Bio::DB::Taxonomy

On Jun 18, 2007, at 5:29 PM, George Heller wrote:
But the problem is that I don't really get any output on the screen. In the /tmp directory I get 4 files, namely parents, nodes, id2names and names2id, but I don't know what to make of them.
-- Jason Stajich jason at bioperl.org http://jason.open-bio.org/

From torsten.seemann at infotech.monash.edu.au Mon Jun 18 21:26:41 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Tue, 19 Jun 2007 11:26:41 +1000
Subject: [Bioperl-l] gff2xml
In-Reply-To:
References: <462784640706120904g25a6550dsc56a22af64ca98cd@mail.gmail.com>
Message-ID:

(Sean, please reply to the bioperl-l list rather than to me personally so everyone can read it. I'm reposting it here.)

> > I posted this on the gbrowse list earlier. I'm looking to convert gff
> > data files into xml. Does anyone know of a module written to do this
> > already?
>
> What DTD do you want the XML to conform to?
> eg. ChadoXML, TinySeq XML, TIGR XML ... ?

Hi Torsten,

I'm collaborating with other groups and want web-service compatible functionality for various tools. Normally the analysis tools I'm using generate gff output. I'm going to have to wrap this output in XML with an XSL stylesheet for end-users to view. Haven't done it before and don't know what DTD to use. The bp_seqconvert.pl script doesn't accept gff format. I would imagine the DTD would be quite short as the gff files are very standard; I just don't have any experience with these DTD requirements.

--Sean O'Keeffe

From sac at bioperl.org Tue Jun 19 02:42:27 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Mon, 18 Jun 2007 23:42:27 -0700
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy)
Message-ID: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>

On 6/16/07, Jason Stajich wrote:
> [...]
> Just to say I already went through all the steps of running cvs2svn
> myself and had problems gathering back out the branches and all the
> tags when I tried it.
If you want to start with a smaller repository
> like bioperl-network or bioperl-db as the initial cvs2svn conversion
> script took quite a long time to run on bioperl-live.

Might this be a good opportunity to investigate partitioning bioperl-live into sub-repositories? There has been talk in the past of defining a set of "core" modules separate from other functionally related groups of modules that would be viewed as optional extensions. The goal being to help manage growth and simplify releases. There are currently 892 modules under Bio/.

In addition to simplifying the migration to SVN, it would also have other benefits. Say some new functionality or a slew of fixes were added to Bio::Graphics. We could turn around a new Bio::Graphics release quickly without having to work on getting various other parts up to snuff that aren't related to graphics (Biblio, DB, PopGen, Search etc.). Maintenance and releases of the various extensions would be more parallelizable, orchestrated by separate ring leaders.

Over time, as a set of functionality matures, it would see fewer updates and there would be less of a need for users to download/install/test it. This could make bioperl easier to customize, extend, and grok in general.

Long term, it should ease development and release cycles, but it will involve a bit of near term bullet-biting. We'd need to get clear on how to partition things, including modules, tests, docs, installation logic, etc. and we'd probably need new integration tests to verify that the subsets continue working together.

What do folks think? Would this SVN-based, re-partitioned bioperl-live constitute a 2.0 release? Any volunteers to help assemble a roadmap and milestones? Should I go on dreaming?
Cheers,
Steve

From bix at sendu.me.uk Tue Jun 19 03:01:05 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 19 Jun 2007 08:01:05 +0100
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To:
References: <369098.81077.qm@web56507.mail.re3.yahoo.com>
Message-ID: <46777F31.7030402@sendu.me.uk>

Jason Stajich wrote:
> The reason it isn't printing anything is someone didn't really write
> the implementation quite right. This code was overhauled by Sendu
> before the last release I guess something didn't quite get connected.
>
> I checked in code that has the Bio::Taxon delegating now to a DB
> handle for the each_Descendent call.
> You can either patch your code or just use the code listed here:
> http://bioperl.org/wiki/Module:Bio::DB::Taxonomy

I've reverted that change. For some reason the docs for Bio::Taxon::each_Descendent aren't showing up on the website, but they state:

---
Note that this method never asks the database for the descendents; it will only return objects you have manually set with add_Descendent(), or where this was done for you by making a Bio::Tree::Tree with this object as an argument to new().

To get the database descendents use $taxon->db_handle->each_Descendent($taxon).
---

I also have a note in the Synopsis for the module:

---
# Though be careful with each_Descendent - unless you add_Descendent()
# yourself, you won't get an answer because unlike for ancestor(),
# Bio::Taxon does not ask the database for the answer. You can ask the
# database yourself using the same method:
($human) = $homo->db_handle->each_Descendent($homo);
---

This is quite deliberate and is to prevent Bad Things from happening. (Can't exactly remember the reasoning now, but I know it was good.)
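A minimal sketch of what the note above implies for collecting all descendents of a taxon: since Bio::Taxon itself does not ask the database, the walk has to be given some way of fetching the children of each node. The helper name all_descendents, the code-ref argument and the toy tree below are assumptions made for illustration, not part of BioPerl; only the db_handle->each_Descendent call in the closing comment is taken from the docs quoted above, and that part has not been run here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Collect every descendent of $node, depth first.
# $children_of is a code ref that returns the immediate children of a node.
# (Hypothetical helper, not a BioPerl method.)
sub all_descendents {
    my ($node, $children_of) = @_;
    my @found;
    for my $child ( $children_of->($node) ) {
        push @found, $child, all_descendents($child, $children_of);
    }
    return @found;
}

# Stand-in data so the sketch runs without BioPerl or the NCBI dump files.
# The ids are made up for illustration; they are not real NCBI taxon ids.
my %toy_tree = (
    1 => [ 2, 3 ],
    2 => [ 4, 5 ],
    3 => [],
    4 => [],
    5 => [ 6 ],
    6 => [],
);
my $toy_children = sub { @{ $toy_tree{ $_[0] } || [] } };

my @all    = all_descendents(1, $toy_children);
# A leaf is a node with no children (the call is in scalar context here,
# so it yields the number of children).
my @leaves = grep { !$toy_children->($_) } @all;

print "all:    @all\n";
print "leaves: @leaves\n";

# With a Bio::Taxon obtained from a Bio::DB::Taxonomy database, the same
# walk would (going by the docs quoted above) ask the database handle:
#
#   my $db_children = sub { $_[0]->db_handle->each_Descendent($_[0]) };
#   my @taxa = all_descendents($node, $db_children);
```

Passing the child lookup in as a code ref keeps the recursion independent of where the tree lives, so the same sub can be tried on toy data first and pointed at the database handle afterwards.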
From bix at sendu.me.uk Tue Jun 19 03:41:57 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 19 Jun 2007 08:41:57 +0100 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> Message-ID: <467788C5.6070406@sendu.me.uk> Steve Chervitz wrote: > Might this been a good opportunity to investigate partitioning > bioperl-live into sub-repositories? There has been talk in the past of > defining a set of "core" modules separate from other functionally > related groups of modules that would be viewed as optional extensions. > The goal being to help manage growth and simplify releases. There are > currently 892 modules under Bio/. > > In addition to simplifying the migration to SVN, it would also have > other benefits. Say some new functionality or a slew of fixes were > added to Bio::Graphics. We could turn around a new Bio::Graphics > release quickly without having to work on getting various other parts > up to snuff that aren't related to graphics (Biblio, DB, PopGen, > Search etc.). Maintenance and releases of the various extensions would > be more parallelizable, orchestrated by separate ring leaders. > > Over time, as a set of functionality matures, it would see fewer > updates and there would be less of a need for users to > download/install/test it. This could make bioperl easier to customize, > extend, and grok in general. > > Long term, it should ease development and release cycles I actually take the opposite view. Breaking things up makes testing and releases more difficult. If one person acts as pumpkin for all the sub-parts, his work-load increases almost linearly with the number of sub-parts. If each sub-part gets its own pumpkin, where do all these pumpkins come from? 
It seems to me that authors frequently write modules, but inevitably their circumstances change and they can no longer devote the time to look after them. Having a single pumpkin and 'forcing' him to make sure everything works (regardless of his personal interest in the module) seems more reliable than hoping there will be a person interested enough in each sub-part to handle its release.

Since all sub-parts will at the least interact with the 'true' core set of Bioperl modules, they need to be tested and potentially re-released every time the true core is updated. And since some sub-parts will interact with other sub-parts, there will need to be coordinated joint-testing and release of multiple sub-parts.

What happens when users report problems? We ask them what version they're running. Right now '1.5.2' means a specific thing, and it's trivial for someone to confirm the same problem by installing 1.5.2. What happens when users have to list out all the versions of all the sub-parts they have? Who is going to consistently recreate a user's hodge-podge of versions in order to confirm a bug? Won't the advice instead be: "update all versions to the latest and get back to us"?

So, as I see it, all sub-parts would best be tested and released with a single new version number every time one sub-part is updated (significantly). In which case, why have sub-parts at all? Keeping things the way they are now means ease of release for the pumpkin and ease of installation for end-users (only one install command to issue to CPAN). Having 'true' sub-parts (each with its own pumpkin), in my fatalistic view, is just going to lead to some useful sub-parts being abandoned and never updated, even where updates may be desirable.

Each and every Bio:: module could have been released separately by its respective author.
As I see it, one of the main values of 'Bioperl' is that it's one (reasonably) consistent collection of modules that lowers the barrier of entry for new Bioinformaticians, giving them extremely easy access to a whole host of functionality with a single install.

From hlapp at gmx.net Tue Jun 19 08:47:02 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 19 Jun 2007 08:47:02 -0400
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <46777F31.7030402@sendu.me.uk>
References: <369098.81077.qm@web56507.mail.re3.yahoo.com> <46777F31.7030402@sendu.me.uk>
Message-ID: <5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net>

So the real mistake was to write

my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

instead of

my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
my @extant_children = grep { $_->is_Leaf } $db->get_all_Descendents($node);

I.e., the Bio::DB::Taxonomy object *will* (or is allowed to) ask the database?

If this is correct, can we highlight this in the documentation? It's a small difference that everyone failed to spot.

If it is not correct, then maybe we need to revisit the rationale for why a Bio::DB::Taxonomy::get_all_Descendents may not query the underlying database.

Also, in my reading of Bio::Taxonomy::Taxon it won't use the database either for ancestor(). Which would be consistent with its other methods.

I.e., the bottom line is don't use Node or Taxon objects for hierarchy queries that you expect to use an underlying database, use the Bio::DB::Taxonomy object instead. It makes sense, but is it true?

-hilmar

On Jun 19, 2007, at 3:01 AM, Sendu Bala wrote:
> Jason Stajich wrote:
>> The reason it isn't printing anything is someone didn't really write
>> the implementation quite right. This code was overhauled by Sendu
>> before the last release I guess something didn't quite get connected.
>> >> I checked in code that has the Bio::Taxon delegating now to a DB >> handle for the each_Descendent call. >> You can either patch your code or just use the code listed here: >> http://bioperl.org/wiki/Module:Bio::DB::Taxonomy > > I've reverted that change. > > For some reason the docs for Bio::Taxon::each_Descendent aren't > showing > up on the website, but they state: > > --- > Note that this method never asks the database for the descendents; it > will only return objects you have manually set with add_Descendent > (), or > where this was done for you by making a Bio::Tree::Tree with this > object > as an argument to new(). > > To get the database descendents use > $taxon->db_handle->each_Descendent($taxon). > --- > > > I also have a note in the Synopsis for the module: > > --- > # Though be careful with each_Descendent - unless you add_Descendent() > # yourself, you won't get an answer because unlike for ancestor(), > # Bio::Taxon does not ask the database for the answer. You can ask the > # database yourself using the same method: > ($human) = $homo->db_handle->each_Descendent($homo); > --- > > > This is quite deliberate and is to prevent Bad Things from happening. > (Can't exactly remember the reasoning now, but I know it was good.) > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From rvos at interchange.ubc.ca Tue Jun 19 09:05:25 2007 From: rvos at interchange.ubc.ca (rvos) Date: Tue, 19 Jun 2007 06:05:25 -0700 (PDT) Subject: [Bioperl-l] SVN and ...Re: Perltidy Message-ID: <15433211.1182258325544.JavaMail.myubc2@brahms.my.ubc.ca> > Unrelated, but it randomly just occurred to me: what happens to all the > id lines at the top of modules? 
Eg: > > $Id: bl2seq.pm,v 1.28 2007/06/14 14:16:10 sendu Exp $ > > That's a cvs-specific thing, right? Do we delete them all? (Regardless, > I wish we would, since they caused me no end of hassles during the 1.5.2 > release, doing updates across branches.) If you run something like 'svn propset svn:keywords Id' on the file/folder/recursively, svn picks up on the $Id tag. The structure of the resulting string would be a little different, because svn revision numbers are simply auto-increasing integers (afaik) - so any regular expressions that cleverly want to include the revision number in $VERSION would need to be updated. From bix at sendu.me.uk Tue Jun 19 10:25:26 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 19 Jun 2007 15:25:26 +0100 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net> References: <369098.81077.qm@web56507.mail.re3.yahoo.com> <46777F31.7030402@sendu.me.uk> <5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net> Message-ID: <4677E756.6050200@sendu.me.uk> Hilmar Lapp wrote: > So the real mistake was to write > > my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); > my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents; > > instead of > > my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); > my @extant_children = grep { $_->is_Leaf } $db->get_all_Descendents > ($node); > > I.e., the Bio::DB::Taxonomy object *will* (or is allowed to) ask the > database? Yes, the database object methods use the database. I don't even think it makes sense to question that. What else would it do? > If this is correct, can we highlight this in the documentation? It's > a small difference that everyone failed to spot. The documentation for what? I've already clearly pointed out the gotcha in Bio::Taxon. > Also, in my reading of Bio::Taxonomy::Taxon it won't use the database > either for ancestor(). Which would be consistent with its other methods. Bio::Taxonomy::Taxon? Its deprecated. 
Bio::Taxon is what we're dealing with, and it /does/ use the db to get the ancestor, unless the ancestor is manually set (see below for explanation). > I.e., the bottom line is don't use Node or Taxon objects for > hierarchy queries that you expect to use an underlying database, use > the Bio::DB::Taxonomy object instead. It makes sense, but is it true? Almost. It happens to be true but ideally wouldn't be the case. The confusion and problems arise, I guess, because we have two ways to access/create hierarchies and both of them are built from the same building block (Bio::Taxon objects). On the one hand we have Bio::DB::Taxonomy and the other we have Bio::Tree::Tree. Tree objects are easy: you have a Taxon object created in memory for each and every node in the tree. Each Taxon knows its ancestor and descendants by storing references to the relevant Taxon objects in the tree. You 'navigate' through the tree by grabbing a Taxon inside it and asking the Taxon itself for its ancestor or descendant. This leaves us with the Taxon object having the methods ancestor() and each_Descendent(), which we'll expect to work in other circumstances. Bio::DB::Taxonomy returns single Taxon objects from the database on request. Now we still expect our ancestor() and each_Descendent() methods to work, but if things were set up like Bio::Tree::Tree we'd end up pulling the entire database into memory because we'd have to create all the Taxon objects that are ancestors and descendants, recursively, every time we request a single Taxon (which is wasteful in the case of Bio::DB::Taxonomy::flatfile and slow/not allowed in the case of Bio::DB::Taxonomy::entrez). The solution? We simply don't create the immediate ancestor or descendant Taxon objects of the requested Taxon, and instead implement the Taxon methods to ask the database to create them on demand, if they don't already exist. 
Well, that idea is fine (and necessary) for the ancestor method, but we run into problems with each_Descendent(). The problem arises when we create Bio::Tree::Tree objects from a Taxon we got from the database. Being able to do that is why Bio::Taxon is shared between them, as it is a very desirable thing to do: you can instantly create a lineage tree for a Taxon of interest and then use all the Bio::Tree::Tree methods on it. Unfortunately one of those methods is get_nodes() which is implemented using each_Descendent() and get_all_Descendents(). If each_Descendent() asked the database for the real answer, we'd end up pulling the entire database into the tree. So my implementation was to not ask the database and just warn people in the docs. Ideally it /would/ use the database, because that's what a user would expect. Can anyone see an alternate way around the problem? From hlapp at gmx.net Tue Jun 19 12:14:38 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 19 Jun 2007 12:14:38 -0400 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <4677E756.6050200@sendu.me.uk> References: <369098.81077.qm@web56507.mail.re3.yahoo.com> <46777F31.7030402@sendu.me.uk> <5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net> <4677E756.6050200@sendu.me.uk> Message-ID: Sorry I was accidentally looking at an older branch. Reading through the Taxon module I get more confused though than would leave me at ease. Here's what I understand of your description of the problem: - We would like nodes returned from Bio::DB::Taxonomy to use the database for all hierarchical queries. - We would like nodes used in a Bio::Tree::Tree not to use the database for any hierarchical query. What I understand that we have is - Taxon node objects that have a db_handle set will use the database for ancestor(), unless it has been set manually (?), but not for each_Descendent(). - Taxon node objects that don't have a db_handle set won't use a database but will function normally otherwise. 
- This is needed to prevent Bio::Tree::Tree methods from pulling the entire tree into memory. If this is correct (I'm not sure it is), it sounds like we want to temporarily divorce taxonomy nodes from their database capabilities while they are being queried in a tree context? I'm still trying to understand - if I create a Bio::Tree::Tree from a single node, will the tree automatically contain all nodes along the lineage of ancestors up to the root? So, even if extracting this lineage involved querying a database it would be acceptable, but not for querying descendents? It sounds to me like what is needed is that nodes that get added to a tree need to be stripped of their database capabilities. This could be achieved by creating a wrapper class that delegates all non- hierarchical methods to the wrapped Taxon object, and overriding all hierarchical queries to not use a database. I'm not sure I fully understand yet though, but the inconsistent behavior will be sure to throw people off track. -hilmar On Jun 19, 2007, at 10:25 AM, Sendu Bala wrote: > Hilmar Lapp wrote: >> So the real mistake was to write >> my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); >> my @extant_children = grep { $_->is_Leaf } $node- >> >get_all_Descendents; >> instead of >> my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); >> my @extant_children = grep { $_->is_Leaf } $db- >> >get_all_Descendents ($node); >> I.e., the Bio::DB::Taxonomy object *will* (or is allowed to) ask >> the database? > > Yes, the database object methods use the database. I don't even > think it makes sense to question that. What else would it do? > > >> If this is correct, can we highlight this in the documentation? >> It's a small difference that everyone failed to spot. > > The documentation for what? I've already clearly pointed out the > gotcha in Bio::Taxon. > > >> Also, in my reading of Bio::Taxonomy::Taxon it won't use the >> database either for ancestor(). 
Which would be consistent with >> its other methods. > > Bio::Taxonomy::Taxon? Its deprecated. Bio::Taxon is what we're > dealing with, and it /does/ use the db to get the ancestor, unless > the ancestor is manually set (see below for explanation). > > >> I.e., the bottom line is don't use Node or Taxon objects for >> hierarchy queries that you expect to use an underlying database, >> use the Bio::DB::Taxonomy object instead. It makes sense, but is >> it true? > > Almost. It happens to be true but ideally wouldn't be the case. The > confusion and problems arise, I guess, because we have two ways to > access/create hierarchies and both of them are built from the same > building block (Bio::Taxon objects). > > On the one hand we have Bio::DB::Taxonomy and the other we have > Bio::Tree::Tree. > > Tree objects are easy: you have a Taxon object created in memory > for each and every node in the tree. Each Taxon knows its ancestor > and descendants by storing references to the relevant Taxon objects > in the tree. You 'navigate' through the tree by grabbing a Taxon > inside it and asking the Taxon itself for its ancestor or descendant. > > This leaves us with the Taxon object having the methods ancestor() > and each_Descendent(), which we'll expect to work in other > circumstances. > > Bio::DB::Taxonomy returns single Taxon objects from the database on > request. Now we still expect our ancestor() and each_Descendent() > methods to work, but if things were set up like Bio::Tree::Tree > we'd end up pulling the entire database into memory because we'd > have to create all the Taxon objects that are ancestors and > descendants, recursively, every time we request a single Taxon > (which is wasteful in the case of Bio::DB::Taxonomy::flatfile and > slow/not allowed in the case of Bio::DB::Taxonomy::entrez). > > The solution? 
We simply don't create the immediate ancestor or > descendant Taxon objects of the requested Taxon, and instead > implement the Taxon methods to ask the database to create them on > demand, if they don't already exist. Well, that idea is fine (and > necessary) for the ancestor method, but we run into problems with > each_Descendent(). > > The problem arises when we create Bio::Tree::Tree objects from a > Taxon we got from the database. Being able to do that is why > Bio::Taxon is shared between them, as it is a very desirable thing > to do: you can instantly create a lineage tree for a Taxon of > interest and then use all the Bio::Tree::Tree methods on it. > Unfortunately one of those methods is get_nodes() which is > implemented using each_Descendent() and get_all_Descendents(). If > each_Descendent() asked the database for the real answer, we'd end > up pulling the entire database into the tree. > > So my implementation was to not ask the database and just warn > people in the docs. Ideally it /would/ use the database, because > that's what a user would expect. Can anyone see an alternate way > around the problem? -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cain.cshl at gmail.com Tue Jun 19 14:41:52 2007 From: cain.cshl at gmail.com (Scott Cain) Date: Tue, 19 Jun 2007 14:41:52 -0400 Subject: [Bioperl-l] [Gmod-gbrowse] is this a bp_genbank2gff3.pl bug? In-Reply-To: <18039.61086.829726.809888@gargle.gargle.HOWL> References: <18039.61086.829726.809888@gargle.gargle.HOWL> Message-ID: <1182278512.2592.42.camel@localhost.localdomain> Hi Alessandra, I cc'ed your message to the bioperl and sequence ontology mailing lists, since your question is relevant to both. 
Converting genbank files to GFF3 is excruciatingly difficult; I generally find that I can use the genbank2gff3 script to get me most of the way there, but then I need to do some manual fixing to make it 'right'. I am using bioperl-live, since there have been several fixes to the script since bioperl 1.5.2 was released, including the most recent fixes from me today (when I started working on this); I would suggest you use bioperl-live as well. I ran the script on chrY. Most (perhaps all) of the errors fit into a few categories: - CDS doesn't have a phase, where the GFF3 spec requires CDSes to have a phase. Since it can be a little bit of a hassle to calculate, I understand why it was left out, but I'll submit a bug report to have those calculated. If you are planning on loading the GFF file into Chado, you can use the --noCDS option to get exons instead of CDSes, which makes the problem go away (the validator has a bug here though--it reports the polypeptide derives_from mRNA as invalid, but it is correct; I'm reporting that directly to the author). Here's the bioperl bug report: http://bugzilla.open-bio.org/show_bug.cgi?id=2322 - "invalid type pair" is caused by the genbank file using feature types in a way that conflicts with the Sequence Ontology. For example, it has STS features that are part_of a gene, pseudogenic_region as part_of pseudogene. I don't know if there would be an easy way to catch this in the conversion script. You may need to fix these by hand. If the problems occur for features that you don't care about, you can use the --filter option to leave them out of the resulting GFF file (for example, adding '--filter STS' would leave all STS features out of the file). Also, if you don't plan on loading these into Chado (which does require SO-compliance) but instead plan on using a Bio::DB::SeqFeature database, these errors may not be a problem. 
- "invalid type" is caused by feature types that are not in SOFA (Sequence Ontology for Feature Annotation), though the terms probably are in SO. I thought at one point we discussed allowing any SO type to appear in the GFF3 type column, but that is not what the spec says now. I don't see this type of error as causing a problem for either Bio::DB::SeqFeature or Chado. Chado allows features to be typed with anything that is in SO and does not restrict to SOFA.

Scott

On Tue, 2007-06-19 at 16:56 +0200, Alessandra Bilardi wrote:
> Hi all,
>
> I used bp_genbank2gff3.pl with CVS bioperl and it created gff3 about
> human genbank file. I used validate_gff3 on line with human.gff and
> it has id non-unique so the database gbrowse inserting has errors.
>
> I attach the error file about hs_ref_chrY.gbk and hs_ref_chr1.gbk that
> I downloaded from ftp://ftp.ncbi.nih.gov/genomes/H_sapiens
> Elements having id non-unique are:
> - CDS or pseudo*exon without mRNA and parent
> - STS with equal start and end
> - tRNA with equal name
>
> If this is a bp_genbank2gff3.pl bug, can you rectify bp_genbank2gff3.pl?
> If I'm mistaken, can you help me?
>
> Thanks very much for the help in advance,
>
> Alessandra.

--
------------------------------------------------------------------------
Scott Cain, Ph. D. cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/) 216-392-3087
Cold Spring Harbor Laboratory
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part Url : http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070619/3d818b27/attachment.bin From sac at bioperl.org Tue Jun 19 14:54:39 2007 From: sac at bioperl.org (Steve Chervitz) Date: Tue, 19 Jun 2007 11:54:39 -0700 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <467788C5.6070406@sendu.me.uk> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> Message-ID: <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> Valid points, Sendu. I wonder if there might be a best-of-both-worlds approach here. I would not be advocating for a major slice and dice, but just identifying a few large, reasonably well established and encapsulated blocks of functionality that could be managed more independently and segregating them away from the rest. For example: DB, Graphics, Search+SearchIO, Tools. Once per year, we could have a "whole caboodle" release where the core and all sub parts are tested and released as a group, as we currently do. Then, updates to the sub parts can occur as-needed but without necessarily involving updates to other sub parts or the core. The onus would be on the pumpkin for the sub part release to make sure it continues to work with the last whole caboodle release. This would minimize the number of release clashes, since sub part updates would only be sanctioned relative to the last caboodle release, and it would ensure that the whole set continues to interoperate. Perhaps it would be worth experimenting with such an approach so we can judge it based on actual experience. We could identify one functional sub part and segregate it out, do a release cycle or two, along with a sub part release, and decide if this makes things easier or harder, for devs as well as users. We could always bring it back into the fold if it doesn't work out. 
My fear is that as bioperl continues to grow, the monolithic approach will become increasingly onerous for a single release pumpkin to manage, and harder to find someone who feels up to the task. It could also discourage new developers from diving into the codebase if it looks too deep. And they are our lifeblood. A more functionally segregated bioperl codebase could lower the activation energy needed to recruit release pumpkins and new devs, leading to more release iterations, fewer bugs, more features, and more sustainable growth. When I first discovered Bioperl in 1996, it had three modules. At ~900, I probably wouldn't have joined ranks as a developer (well, I probably would, but it would have taken a while to digest it and become a contributor). Steve On 6/19/07, Sendu Bala wrote: > Steve Chervitz wrote: > > Might this been a good opportunity to investigate partitioning > > bioperl-live into sub-repositories? There has been talk in the past of > > defining a set of "core" modules separate from other functionally > > related groups of modules that would be viewed as optional extensions. > > The goal being to help manage growth and simplify releases. There are > > currently 892 modules under Bio/. > > > > In addition to simplifying the migration to SVN, it would also have > > other benefits. Say some new functionality or a slew of fixes were > > added to Bio::Graphics. We could turn around a new Bio::Graphics > > release quickly without having to work on getting various other parts > > up to snuff that aren't related to graphics (Biblio, DB, PopGen, > > Search etc.). Maintenance and releases of the various extensions would > > be more parallelizable, orchestrated by separate ring leaders. > > > > Over time, as a set of functionality matures, it would see fewer > > updates and there would be less of a need for users to > > download/install/test it. This could make bioperl easier to customize, > > extend, and grok in general. 
> > > > Long term, it should ease development and release cycles > > I actually take the opposite view. Breaking things up makes testing and > releases more difficult. > > If one person acts as pumpkin for all the sub-parts, his work-load > increases almost linearly with the number of sub-parts. If each sub-part > gets its own pumpkin, where do all these pumpkins come from? It seems to > me that frequently authors will write modules but inevitably their > circumstance changes and they can no longer devote the time to look > after them. Having a single pumpkin and 'forcing' him to make sure > everything works (regardless of his personal interest in the module) > seems more reliable than hoping there will be a person interested enough > in each sub-part to handle its release. > > Since all sub-parts will at the least interact with the 'true' core set > of Bioperl modules, they need to be tested and potentially re-released > every time the true core is updated. And since some sub-parts will > interact with other sub-parts, there will need to be coordinated > joint-testing and release of multiple sub-parts. > > What happens when users report problems? We ask them what version > they're running. Right now '1.5.2' means a specific thing, and its > trivial for someone to confirm the same problem by installing 1.5.2. > What happens when users have to list out all the versions of all the > sub-parts they have? Who is going to consistently recreate a users > hodge-podge of versions in order to confirm a bug? Won't the advice > instead be: "update all versions to the latest and get back to us"? > > So, as I see it, all sub-parts would best be tested and released with a > single new version number every time one sub-part is updated > (significantly). In which case, why have sub-parts at all? Keeping > things the way they are now means ease of release for the pumpkin and > ease of installation for end-users (only one install command to issue to > CPAN). 
Having 'true' sub-parts (each with its own pumpkin), in my > fatalistic view, is just going to lead to some useful sub-parts being > abandoned and never updated, even where updates may be desirable. > > Each and every Bio:: module could have been released separately by its > respective author. As I see it, one of the main values of 'Bioperl' is > that its one (reasonably) consistent collection of modules that lowers > the barrier of entry for new Bioinformaticians, giving them extremely > easy access to a whole host of functionality with a single install. > From bix at sendu.me.uk Tue Jun 19 15:13:39 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 19 Jun 2007 20:13:39 +0100 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> Message-ID: <46782AE3.2090703@sendu.me.uk> Steve Chervitz wrote: > Valid points, Sendu. I wonder if there might be a best-of-both-worlds > approach here. [snip] You haven't convinced me, but I'd go along with the majority decision if best-of-both-worlds was picked. > DB, Graphics, Search+SearchIO, Tools. I will, however, say that DB interleaves into too many core modules. It should stay in core. Tools? Its hardly touched anyway, so I don't see the value of taking it out, what with Bio::Tools::Run already being its own package. Most Bioperl users probably get Bioperl just to do something Blast related, so all Blast stuff really ought to stay in core. Graphics is an obvious choice and I agree. Updated frequently, and has its own release needs. It also has some of the trickier dependencies, so would make installing core simpler. I can imagine plucking Search+SearchIO out, and its something that needs regular updating. Another good candidate. 
> Perhaps it would be worth experimenting with such an approach so we > can judge it based on actual experience. We could identify one > functional sub part and segregate it out, do a release cycle or two, > along with a sub part release, and decide if this makes things easier > or harder, for devs as well as users. Well, we already have the run package. Its a split-off subpart that gets updated. The only 'experiment' left to do is finding it its own pumpkin. From bix at sendu.me.uk Tue Jun 19 15:48:50 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 19 Jun 2007 20:48:50 +0100 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: References: <369098.81077.qm@web56507.mail.re3.yahoo.com> <46777F31.7030402@sendu.me.uk> <5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net> <4677E756.6050200@sendu.me.uk> Message-ID: <46783322.30309@sendu.me.uk> Hilmar Lapp wrote: > Here's what I understand of your description of the problem: > > - We would like nodes returned from Bio::DB::Taxonomy to use the > database for all hierarchical queries. > > - We would like nodes used in a Bio::Tree::Tree not to use the > database for any hierarchical query. Correct. > What I understand that we have is > > - Taxon node objects that have a db_handle set will use the database > for ancestor(), unless it has been set manually (?), but not for > each_Descendent(). > > - Taxon node objects that don't have a db_handle set won't use a > database but will function normally otherwise. > > - This is needed to prevent Bio::Tree::Tree methods from pulling the > entire tree into memory. Correct. > If this is correct (I'm not sure it is), it sounds like we want to > temporarily divorce taxonomy nodes from their database capabilities > while they are being queried in a tree context? Yes. > I'm still trying to understand - if I create a Bio::Tree::Tree from a > single node, will the tree automatically contain all nodes along the > lineage of ancestors up to the root? 
So, even if extracting this > lineage involved querying a database it would be acceptable, but not > for querying descendents? Yes. Asking the database for all the ancestors up to root only pulls a couple of nodes into the tree and is exactly what the user would want to happen. But if nodes are allowed to get their descendants from the database, when we get the root node from the database, we'd get all the root's descendants, and then for each of those we'd get all /their/ descendants... that's when the whole db gets sucked in. > It sounds to me like what is needed is that nodes that get added to a > tree need to be stripped of their database capabilities. This could > be achieved by creating a wrapper class that delegates all non- > hierarchical methods to the wrapped Taxon object, and overriding all > hierarchical queries to not use a database. I'm not sure I fully > understand yet though, but the inconsistent behavior will be sure to > throw people off track. When we're making a tree from a db Taxon we need db access to find all the ancestors; we just don't want to get any descendants outside our initiating Taxon's direct lineage. 
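
The wrapper idea quoted above can be sketched in plain Perl. This is only an illustration under stated assumptions, not Bioperl code: My::DBTaxon and My::TreeTaxon are hypothetical stand-ins for the real taxon classes, and the "database" is just a hash mapping a name to its child names. The wrapper delegates everything it does not define to the wrapped object and answers descendant queries only from what has been explicitly added to the tree.

```perl
use strict;
use warnings;

# Hypothetical database-backed taxon (a stand-in, not the real Bio::Taxon).
package My::DBTaxon;

sub new {
    my ($class, %args) = @_;
    return bless { name => $args{name}, db => $args{db} }, $class;
}

sub scientific_name { return $_[0]{name} }

# Hierarchical query answered from the "database" (a hash of child names).
sub each_Descendent {
    my ($self) = @_;
    my $children = $self->{db}{ $self->{name} } || [];
    return map { My::DBTaxon->new(name => $_, db => $self->{db}) } @$children;
}

# Hypothetical wrapper used inside a tree: descendant queries never
# touch the database, everything else is passed through.
package My::TreeTaxon;

our $AUTOLOAD;

sub new {
    my ($class, $taxon) = @_;
    return bless { taxon => $taxon, children => [] }, $class;
}

sub add_Descendent {
    my ($self, $child) = @_;
    push @{ $self->{children} }, $child;
    return;
}

# Overridden: only nodes explicitly added to the tree are returned.
sub each_Descendent { return @{ $_[0]{children} } }

sub DESTROY { }    # keep AUTOLOAD from seeing object destruction

# Delegate any other method to the wrapped taxon.
sub AUTOLOAD {
    my $self = shift;
    (my $method = $AUTOLOAD) =~ s/.*:://;
    return $self->{taxon}->$method(@_);
}

package main;

my %db   = (Homo => ['Homo sapiens']);
my $homo = My::DBTaxon->new(name => 'Homo', db => \%db);
my $node = My::TreeTaxon->new($homo);

print $node->scientific_name, "\n";    # delegated to the wrapped object
my @in_tree = $node->each_Descendent;  # empty: nothing added to the tree yet
my @in_db   = $homo->each_Descendent;  # the unwrapped object still asks the db
print scalar(@in_tree), " vs ", scalar(@in_db), "\n";
```

In this sketch an ancestor lookup would pass through AUTOLOAD to the wrapped object in the same way as scientific_name, so the lineage up to the root could still come from the database while descendants could not.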

my @names = ('Eukaryota', 'Mammalia', 'Primates', 'Homo', 'Homo sapiens');
my @ranks = qw(superkingdom class order genus species);
my $db = Bio::DB::Taxonomy->new(-source => 'list',
                                -names  => \@names,
                                -ranks  => \@ranks);

@names = ('Eukaryota', 'Mammalia', 'Rodentia', 'Mus', 'Mus musculus');
$db->add_lineage(-names => \@names, -ranks => \@ranks);

my $homo = $db->get_taxon(-name => 'Homo');
isa_ok($homo, 'Bio::Taxon'); # PASS
is $homo->ancestor->scientific_name, 'Primates'; # PASS
my @descs = $homo->each_Descendent;
is @descs, 1; # FAIL, we wanted it to contain the 'Homo sapiens' node

my $lineage = Bio::Tree::Tree->new(-node => $homo);
is $lineage->get_root_node->scientific_name, 'Eukaryota'; # PASS
my @nodes = $lineage->get_nodes;
is @nodes, 4; # PASS: we didn't pull in Rodentia which would be 8

(on that last test I can't remember if the answer might actually be 5
because our lineage does contain 'Homo sapiens')

If anyone can figure out how to get all those to pass, please let me know.

From cjfields at uiuc.edu  Tue Jun 19 17:15:00 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 19 Jun 2007 16:15:00 -0500
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy)
In-Reply-To: <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
	<467788C5.6070406@sendu.me.uk>
	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
Message-ID: <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>

On Jun 19, 2007, at 1:54 PM, Steve Chervitz wrote:

> Valid points, Sendu. I wonder if there might be a best-of-both-worlds
> approach here. I would not be advocating for a major slice and dice,
> but just identifying a few large, reasonably well established and
> encapsulated blocks of functionality that could be managed more
> independently and segregating them away from the rest. For example:
> DB, Graphics, Search+SearchIO, Tools.

There should also be a consensus between the core devs on this; I don't see it going very far w/o Lincoln, Jason, Hilmar, etc. voicing their opinions as it will directly impact projects which rely on core functionality (GBrowse/GMOD, bioperl-db, etc). I also agree with George that this should be postponed until after svn issues are taken care of. Stating that, I think this is a good idea in general, though we'll need to be careful which ones we segregate out as non-core. I agree with your choices; I would add in Bio::Restriction, Bio::Assembly, Bio::Structure, and a few more. As long as the distribution required installation of 'core' prior to test runs it shouldn't be too much of a problem. In order for this to work we would need to delineate what defines 'core' (how broad the definition should be), then identify those modules that don't fit and decide what to do with them. Would we want to split the others into separate packages or lump together as a bioperl-auxiliary (horrid name, but you get my point)? Too many could be a logistical nightmare, as Sendu has pointed out. > Once per year, we could have a "whole caboodle" release where the core > and all sub parts are tested and released as a group, as we currently > do. Then, updates to the sub parts can occur as-needed but without > necessarily involving updates to other sub parts or the core. Sounds fine by me. Actually, my thought was we could reimplement Bundle::BioPerl on CPAN (which Module::Build effectively obsoleted) to install all the necessary subpackages in order to emulate an old- style 'core' installation, or act as an 'install everything BioPerl- related' Bundle. Regular updates of the subpackages to CPAN should just require updating the Bundle (which would update only the relevant parts, at least I believe it would). > The onus would be on the pumpkin for the sub part release to make sure > it continues to work with the last whole caboodle release. 
This would > minimize the number of release clashes, since sub part updates would > only be sanctioned relative to the last caboodle release, and it would > ensure that the whole set continues to interoperate. > > Perhaps it would be worth experimenting with such an approach so we > can judge it based on actual experience. We could identify one > functional sub part and segregate it out, do a release cycle or two, > along with a sub part release, and decide if this makes things easier > or harder, for devs as well as users. We could always bring it back > into the fold if it doesn't work out. > > My fear is that as bioperl continues to grow, the monolithic approach > will become increasingly onerous for a single release pumpkin to > manage, and harder to find someone who feels up to the task. It could > also discourage new developers from diving into the codebase if it > looks too deep. And they are our lifeblood. Agreed! > A more functionally segregated bioperl codebase could lower the > activation energy needed to recruit release pumpkins and new devs, > leading to more release iterations, fewer bugs, more features, and > more sustainable growth. 'Activation energy.' Hmm. Spoken like a true biologist. > When I first discovered Bioperl in 1996, it had three modules. At > ~900, I probably wouldn't have joined ranks as a developer (well, I > probably would, but it would have taken a while to digest it and > become a contributor). > > Steve I pretty much agree, though this will require quite a bit more discussion. 
chris From hlapp at gmx.net Tue Jun 19 17:57:54 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 19 Jun 2007 17:57:54 -0400 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> Message-ID: <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> On Jun 19, 2007, at 5:15 PM, Chris Fields wrote: > There should also be a consensus between the core devs on this; I > don't see it going very far w/o Lincoln, Jason, Hilmar, etc. voicing > their opinions The problem I have increasingly had with BioPerl (aside from the fact that it's written in Perl ;) is the plethora of dependencies I need to install, not the number of modules. But every time I've been told that that's what Perl is all about, and I should shut up and install the bundle. Idiosyncratically I don't like bundles that clutter up my hard disk with stuff I'll never use, and in this sense if BioPerl is divided into 10 packages I will have to think about each one whether I need it, and do a separate CVS checkout - and regular update - of each one (though granted, I believe there are ways the multiple checkout and update thing can be taken care of). In reality, this may be a rapidly disappearing trait though of those who have grown up in a time when they proudly spent all their savings to buy that new computer because it had a 20MB hard disk, compared to the two 360k floppy drives the previous one had. So don't ask me, just don't make it too hard for the dinosaurs. > as it will directly impact projects which rely on core > functionality (GBrowse/GMOD, bioperl-db, etc). Well, I hope there are ways to limit that? > I also agree with George that this should be postponed until after > svn issues are taken care of. I agree entirely. 
Please don't throw this in the same bin or tie one to the other. The migration is neither easier nor faster nor better testable with a partitioned BioPerl. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Tue Jun 19 21:48:20 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 19 Jun 2007 20:48:20 -0500 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> Message-ID: On Jun 19, 2007, at 4:57 PM, Hilmar Lapp wrote: > On Jun 19, 2007, at 5:15 PM, Chris Fields wrote: > >> There should also be a consensus between the core devs on this; I >> don't see it going very far w/o Lincoln, Jason, Hilmar, etc. voicing >> their opinions > > The problem I have increasingly had with BioPerl (aside from the fact > that it's written in Perl ;) is the plethora of dependencies I need > to install, not the number of modules. > > But every time I've been told that that's what Perl is all about, and > I should shut up and install the bundle. Idiosyncratically I don't > like bundles that clutter up my hard disk with stuff I'll never use, > and in this sense if BioPerl is divided into 10 packages I will have > to think about each one whether I need it, and do a separate CVS > checkout - and regular update - of each one (though granted, I > believe there are ways the multiple checkout and update thing can be > taken care of). I agree; the fewer dependencies the better. 
We could divide it up into a small, focused core package with only a few dependencies, and 1-3 more containing the focused bits which require the most maintenance (Graphics, SearchIO/Tools, etc). I worry about having too many more. > In reality, this may be a rapidly disappearing trait though of those > who have grown up in a time when they proudly spent all their savings > to buy that new computer because it had a 20MB hard disk, compared to > the two 360k floppy drives the previous one had. > > So don't ask me, just don't make it too hard for the dinosaurs. There would need to be some way of getting an old-style full-blown core installation regardless of how many subdistros we would divy core up into. My thought for CPAN was having Bundle::BioPerl take over this but I'm not sure if it's still being used. Maybe there are other ways for svn/cvs. >> as it will directly impact projects which rely on core >> functionality (GBrowse/GMOD, bioperl-db, etc). > > Well, I hope there are ways to limit that? I believe so, yes, particularly for bioperl-db. I would think splitting off Bio::Graphics or Bio::DB* will have some effect on GBrowse/GFF. >> I also agree with George that this should be postponed until after >> svn issues are taken care of. > > I agree entirely. Please don't throw this in the same bin or tie one > to the other. The migration is neither easier nor faster nor better > testable with a partitioned BioPerl. > > -hilmar We def. have to complete transition to subversion first, then think about this some more. chris From n.haigh at sheffield.ac.uk Wed Jun 20 02:31:24 2007 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Wed, 20 Jun 2007 07:31:24 +0100 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> Message-ID: <4678C9BC.10206@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Chris Fields wrote: > On Jun 19, 2007, at 4:57 PM, Hilmar Lapp wrote: > >> On Jun 19, 2007, at 5:15 PM, Chris Fields wrote: >> >>> There should also be a consensus between the core devs on this; I >>> don't see it going very far w/o Lincoln, Jason, Hilmar, etc. voicing >>> their opinions >> The problem I have increasingly had with BioPerl (aside from the fact >> that it's written in Perl ;) is the plethora of dependencies I need >> to install, not the number of modules. >> >> But every time I've been told that that's what Perl is all about, and >> I should shut up and install the bundle. Idiosyncratically I don't >> like bundles that clutter up my hard disk with stuff I'll never use, >> and in this sense if BioPerl is divided into 10 packages I will have >> to think about each one whether I need it, and do a separate CVS >> checkout - and regular update - of each one (though granted, I >> believe there are ways the multiple checkout and update thing can be >> taken care of). > > I agree; the fewer dependencies the better. We could divide it up > into a small, focused core package with only a few dependencies, and > 1-3 more containing the focused bits which require the most > maintenance (Graphics, SearchIO/Tools, etc). I worry about having > too many more. 
> >> In reality, this may be a rapidly disappearing trait though of those >> who have grown up in a time when they proudly spent all their savings >> to buy that new computer because it had a 20MB hard disk, compared to >> the two 360k floppy drives the previous one had. >> >> So don't ask me, just don't make it too hard for the dinosaurs. > > There would need to be some way of getting an old-style full-blown > core installation regardless of how many subdistros we would divy > core up into. My thought for CPAN was having Bundle::BioPerl take > over this but I'm not sure if it's still being used. Maybe there are > other ways for svn/cvs. Personally, I think this use of Bundle::Bioperl is more in line with what CPAN Bundles were meant to do - "a bundle is a collection of modules that comprise a cohesive unit". Under that definition you could probably put the whole of Bioperl but I won't go there! When a package is updated and a new release is made, this should be installable/updatable via cpan as well as updating the bundle with the correct version. This was you can get all of Bioperl via the bundle, or just install the sub-packages on their own. If the switch over to svn takes place, will all the Bioperl-* projects move over at the same time? If so, will they go into their own svn repository or into the same one? Since with svn you can checkout any subtree of the repository I'm not clear on the pro's and cons of either of these options. Am I right in thinking that there is a way for cvs to define a "project" such that when you checkout that "project" it actually checks out multiple projects behind the scene? I'm sure I've seen this somewhere, possibly when the project is dependent on some 3rd party code that is also in cvs. If this is possible, I'm sure it will also be possible with svn. This could then allow something like the following to happen after the split up of Bioperl. The following projects could be defined: bioperl-core, bioperl-graphics etc. 
Issuing a checkout of a "project" called "bioperl" would actually check out
the real projects called bioperl-core, bioperl-graphics etc. I may just be
dreaming, but it seems that this ought to be possible, doesn't it?

>
>>> as it will directly impact projects which rely on core
>>> functionality (GBrowse/GMOD, bioperl-db, etc).
>> Well, I hope there are ways to limit that?
>
> I believe so, yes, particularly for bioperl-db. I would think
> splitting off Bio::Graphics or Bio::DB* will have some effect on
> GBrowse/GFF.
>
>>> I also agree with George that this should be postponed until after
>>> svn issues are taken care of.
>> I agree entirely. Please don't throw this in the same bin or tie one
>> to the other. The migration is neither easier nor faster nor better
>> testable with a partitioned BioPerl.
>>
>> -hilmar
>
> We def. have to complete transition to subversion first, then think
> about this some more.
>
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGeMm7czuW2jkwy2gRAi+CAJ9cNZ70GojV7eviRjdWTFLk/MKYoACg2Ls4
op9sQTZyeK6G6taFhTAPMYc=
=7NRw
-----END PGP SIGNATURE-----

From hlapp at gmx.net  Wed Jun 20 07:46:16 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 20 Jun 2007 07:46:16 -0400
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy)
In-Reply-To: <4678C9BC.10206@sheffield.ac.uk>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
	<467788C5.6070406@sendu.me.uk>
	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>
	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
	<4678C9BC.10206@sheffield.ac.uk>
Message-ID: <5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Jun
20, 2007, at 2:31 AM, Nathan S. Haigh wrote: > If the switch over to svn takes place, will all the Bioperl-* projects > move over at the same time? They are under the same CVSROOT right now. Locking down some sub- repositories but not others may be odd or impossible. > If so, will they go into their own svn repository or into the same > one? Good question, I'm not sure about the pros and cons one way or the other either. The fewer repositories the less sysadmin work in fine- graining permissions. -hilmar - -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (Darwin) iD8DBQFGeRONuV6N2JxL7qsRAoYTAJ9GVuC0j4szCcWTg7yWGoxN3YFucQCgogJ8 Ims4d150lsX0vXtDwGI1lKg= =K4++ -----END PGP SIGNATURE----- From n.haigh at sheffield.ac.uk Wed Jun 20 07:57:22 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Wed, 20 Jun 2007 12:57:22 +0100 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> <4678C9BC.10206@sheffield.ac.uk> <5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net> Message-ID: <46791622.6080409@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hilmar Lapp wrote: > > On Jun 20, 2007, at 2:31 AM, Nathan S. Haigh wrote: > >> If the switch over to svn takes place, will all the Bioperl-* projects >> move over at the same time? > > They are under the same CVSROOT right now. Locking down some > sub-repositories but not others may be odd or impossible. > >> If so, will they go into their own svn repository or into the same one? 
>
> Good question, I'm not sure about the pros and cons one way or the other
> either. The fewer repositories the less sysadmin work in fine-graining
> permissions.
>
> -hilmar
>

I don't think there is any major reason why the following single repos
wouldn't do the trick:

/--
  |-bioperl-live
  |   |--- trunk
  |   |--- branches
  |   |--- tags
  |
  |-bioperl-run
      |--- trunk
      |--- branches
      |--- tags

Any reason why this couldn't be used? I know some people don't like the
idea of the revision number incrementing for the whole repository if it
contains several "projects". However, revision numbers are really only a
way for svn to keep track of things and a very large revision number
shouldn't really "upset" anyone.

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGeRYiczuW2jkwy2gRApS5AJsHl73MWZP8aMfOqlLgTYuzpMWmQgCg3VqA
1Vj8BSUnanpdjYYLE6eGanU=
=bOqK
-----END PGP SIGNATURE-----

From hlapp at gmx.net  Wed Jun 20 08:08:33 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 20 Jun 2007 08:08:33 -0400
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy)
In-Reply-To: <46791622.6080409@sheffield.ac.uk>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
	<467788C5.6070406@sendu.me.uk>
	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>
	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
	<4678C9BC.10206@sheffield.ac.uk>
	<5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net>
	<46791622.6080409@sheffield.ac.uk>
Message-ID:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Jun 20, 2007, at 7:57 AM, Nathan S. Haigh wrote:

> I don't think there is any major reason why the following single repos
> wouldn't do the trick:
>
> /--
>   |-bioperl-live
>   |   |--- trunk
>   |   |--- branches
>   |   |--- tags
>   |
>   |-bioperl-run
>       |--- trunk
>       |--- branches
>       |--- tags
>
> Any reason why this couldn't be used?

That would work fine except that there are several more sub-projects (bioperl-db, bioperl-graphics, bioperl-microarray, and a few more). That should still be fine. I think what needs to be recognized is the limitations it puts on permission granularity. If it's all the same repository (as is now) then having commit rights to one (subproject) will mean commit rights to all. From my perspective that's fine, it has worked great so far. -hilmar - -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (Darwin) iD8DBQFGeRjFuV6N2JxL7qsRAj3dAJ42r1C8By29DNTUP9Ts0Lf5dOcS9QCgjSE1 hckjT7LBtHcmwGI8B+BKQIM= =gYfA -----END PGP SIGNATURE----- From hartzell at alerce.com Tue Jun 19 15:53:39 2007 From: hartzell at alerce.com (George Hartzell) Date: Tue, 19 Jun 2007 12:53:39 -0700 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> Message-ID: <18040.13379.217277.992742@almost.alerce.com> Steve Chervitz writes: > On 6/16/07, Jason Stajich wrote: > > [...] > > Just to say I already went through all the steps of running cvs2svn > > myself and had problems gathering back out the branches and all the > > tags when I tried it. If you want to start with a smaller repository > > like bioperl-network or bioperl-db as the initial cvs2svn conversion > > script took quite a long time to run on bioperl-live. > > Might this been a good opportunity to investigate partitioning > bioperl-live into sub-repositories? [...] I'd say that the time to do this kind of rearrangement would be *after* the svn repo's set up. That way you'll be able to track stuff back through to the beginning of time. g. 

From sdavis2 at mail.nih.gov  Wed Jun 20 08:44:08 2007
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Wed, 20 Jun 2007 08:44:08 -0400
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy)
In-Reply-To:
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
	<467788C5.6070406@sendu.me.uk>
	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>
	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
	<4678C9BC.10206@sheffield.ac.uk>
	<5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net>
	<46791622.6080409@sheffield.ac.uk>
Message-ID: <46792118.4030205@mail.nih.gov>

Hilmar Lapp wrote:
>
> On Jun 20, 2007, at 7:57 AM, Nathan S. Haigh wrote:
>
>> I don't think there is any major reason why the following single repos
>> wouldn't do the trick:
>
>> /--
>>   |-bioperl-live
>>   |   |--- trunk
>>   |   |--- branches
>>   |   |--- tags
>>   |
>>   |-bioperl-run
>>       |--- trunk
>>       |--- branches
>>       |--- tags
>
>> Any reason why this couldn't be used?
>
> That would work fine except that there are several more sub-projects
> (bioperl-db, bioperl-graphics, bioperl-microarray, and a few more).
>
> That should still be fine. I think what needs to be recognized is the
> limitations it puts on permission granularity. If it's all the same
> repository (as is now) then having commit rights to one (subproject)
> will mean commit rights to all. From my perspective that's fine, it
> has worked great so far.

Actually, I think there are ways of creating per-directory access control.
See here:

http://svnbook.red-bean.com/en/1.2/svn-book.html#svn.serverconfig.svnserve.auth.general

With Apache-based https access, such access control is relatively
straightforward, it appears. With the standalone svn server over ssh, one
needs to use "commit hook scripts" to limit access. But I think it is
possible (admitting that I have not tried to do this...).

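
The "commit hook script" route mentioned above could be sketched roughly as follows. This is a minimal illustration, not a tested hook: the %writers map, the user names, and denied_paths() are all hypothetical, and the paths are passed in by hand. A real pre-commit hook would get the author and the changed paths from svnlook for the pending transaction and exit non-zero to reject the commit.

```perl
use strict;
use warnings;

# Hypothetical map: top-level project directory => users allowed to commit.
my %writers = (
    'bioperl-live' => { alice => 1, bob => 1 },
    'bioperl-run'  => { alice => 1 },
);

# Return the changed paths that $author is not allowed to touch.
# Paths are assumed to be repository-relative with no leading slash.
sub denied_paths {
    my ($author, $writers, @paths) = @_;
    my @denied;
    for my $path (@paths) {
        my ($project) = split m{/}, $path;
        my $allowed = $writers->{$project} || {};
        push @denied, $path unless $allowed->{$author};
    }
    return @denied;
}

# Example: bob may commit to bioperl-live but not to bioperl-run.
my @denied = denied_paths('bob', \%writers,
    'bioperl-live/trunk/Bio/Seq.pm',
    'bioperl-run/trunk/Build.PL');
print "denied: $_\n" for @denied;    # prints "denied: bioperl-run/trunk/Build.PL"
```

Keeping the decision in a small function like this makes the policy easy to check on its own, separately from the hook plumbing.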
Sean From hartzell at alerce.com Wed Jun 20 09:23:32 2007 From: hartzell at alerce.com (George Hartzell) Date: Wed, 20 Jun 2007 06:23:32 -0700 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <4678C9BC.10206@sheffield.ac.uk> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> <4678C9BC.10206@sheffield.ac.uk> Message-ID: <18041.10836.728079.835572@almost.alerce.com> Nathan S. Haigh writes: > [...] > If the switch over to svn takes place, will all the Bioperl-* projects > move over at the same time? If so, will they go into their own svn > repository or into the same one? Since with svn you can checkout any > subtree of the repository I'm not clear on the pro's and cons of either > of these options. I'm planning to drop the projects from the top of the CVSROOT into a single svn repository: bioperl-ext bioperl-pipeline biodata bioperl-gui bioperl-run bioperl-cookbook bioperl-live biosql-schema bioperl-corba-client bioperl-microarray html bioperl-corba-server bioperl-network task-manager bioperl-das-client bioperl-papers xml-html bioperl-db bioperl-pedigree although that's open to feedback from the core members. As a progress report, I've built a demo repos with -run, -ext, and -live in it and asked a couple of folks to to take a peek at it. When I get a bit further along I'll figure out how to get something for the public to test. > Am I right in thinking that there is a way for cvs to define a "project" > such that when you checkout that "project" it actually checks out > multiple projects behind the scene? I'm sure I've seen this somewhere, > possibly when the project is dependent on some 3rd party code that is > also in cvs. If this is possible, I'm sure it will also be possible with > svn. 
This could then allow something like the following to happen after > the split up of Bioperl. The following projects could be defined: > bioperl-core, bioperl-graphics etc. Issuing a checkout of a "project" > called "bioperl" would actually checkout the real projects call > bioperl-core, bioperl-graphics etc. I may just be dreaming, but it seems > that this ought to be possible, doesn't it? > [...] I don't think that there's any functionality like that in svn. g. From hartzell at alerce.com Wed Jun 20 09:26:04 2007 From: hartzell at alerce.com (George Hartzell) Date: Wed, 20 Jun 2007 06:26:04 -0700 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <46791622.6080409@sheffield.ac.uk> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> <4678C9BC.10206@sheffield.ac.uk> <5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net> <46791622.6080409@sheffield.ac.uk> Message-ID: <18041.10988.375946.833182@almost.alerce.com> Nathan S. Haigh writes: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hilmar Lapp wrote: > > > > On Jun 20, 2007, at 2:31 AM, Nathan S. Haigh wrote: > > > >> If the switch over to svn takes place, will all the Bioperl-* projects > >> move over at the same time? > > > > They are under the same CVSROOT right now. Locking down some > > sub-repositories but not others may be odd or impossible. > > > >> If so, will they go into their own svn repository or into the same one? > > > > Good question, I'm not sure about the pros and cons one way or the other > > either. The fewer repositories the less sysadmin work in fine-graining > > permissions. 
 > >
 > > -hilmar
 > >
 >
 >
 > I don't think there is any major reason why the following single repos
 > wouldn't do the trick:
 >
 > /--
 >   |-bioperl-live
 >   |   |--- trunk
 >   |   |--- branches
 >   |   |--- tags
 >   |
 >   |-bioperl-run
 >       |--- trunk
 >       |--- branches
 >       |--- tags
 >
 > Any reason why this couldn't be used?
 > [...]

That's exactly the way that I'm setting it up.

g.

From n.haigh at sheffield.ac.uk  Wed Jun 20 09:33:33 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 20 Jun 2007 14:33:33 +0100
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy)
In-Reply-To: <18041.10836.728079.835572@almost.alerce.com>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
	<467788C5.6070406@sendu.me.uk>
	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>
	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
	<4678C9BC.10206@sheffield.ac.uk>
	<18041.10836.728079.835572@almost.alerce.com>
Message-ID: <46792CAD.5060700@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

George Hartzell wrote:
> Nathan S. Haigh writes:
>  > [...]
>  > If the switch over to svn takes place, will all the Bioperl-* projects
>  > move over at the same time? If so, will they go into their own svn
>  > repository or into the same one? Since with svn you can checkout any
>  > subtree of the repository I'm not clear on the pro's and cons of either
>  > of these options.
>
> I'm planning to drop the projects from the top of the CVSROOT into a
> single svn repository:
>
>   bioperl-ext           bioperl-pipeline    biodata            bioperl-gui
>   bioperl-run           bioperl-cookbook    bioperl-live       biosql-schema
>   bioperl-corba-client  bioperl-microarray  html               bioperl-corba-server
>   bioperl-network       task-manager        bioperl-das-client bioperl-papers
>   xml-html              bioperl-db          bioperl-pedigree
>
> although that's open to feedback from the core members.
> > As a progress report, I've built a demo repos with -run, -ext, and > -live in it and asked a couple of folks to to take a peek at it. When > I get a bit further along I'll figure out how to get something for the > public to test. Could I take a peek?? > > > Am I right in thinking that there is a way for cvs to define a "project" > > such that when you checkout that "project" it actually checks out > > multiple projects behind the scene? I'm sure I've seen this somewhere, > > possibly when the project is dependent on some 3rd party code that is > > also in cvs. If this is possible, I'm sure it will also be possible with > > svn. This could then allow something like the following to happen after > > the split up of Bioperl. The following projects could be defined: > > bioperl-core, bioperl-graphics etc. Issuing a checkout of a "project" > > called "bioperl" would actually checkout the real projects call > > bioperl-core, bioperl-graphics etc. I may just be dreaming, but it seems > > that this ought to be possible, doesn't it? > > [...] > > I don't think that there's any functionality like that in svn. 
I did come across this which might help:
http://subversion.tigris.org/servlets/ReadMsg?listName=users&msgNo=43561

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGeSytczuW2jkwy2gRAnlUAJ4pjhPlYlqOm+M882Ni116MJVzPCwCbB3Su
sWDAmqFhGgtlyeawaIGSV14=
=zeAY
-----END PGP SIGNATURE-----

From bix at sendu.me.uk Wed Jun 20 11:38:20 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 20 Jun 2007 16:38:20 +0100
Subject: [Bioperl-l] New testing base: BioperlTest.pm
Message-ID: <467949EC.9040100@sendu.me.uk>

In considering updating all the test scripts to take advantage of the new network option, and/or reimplementing them in Test::More, I thought now would be a good time to standardize all the test scripts and reduce the possibility of having to alter them all in the future if something changes.

For example we could decide on an alternate way of choosing to run network tests, or a new way of deciding to output debug information. There are also some inconsistencies in the messages produced by tests skipping all, and even an unfortunate mistake that has been copy/pasted through a lot of test scripts.

My solution is t/lib/BioperlTest.pm (documented with perldoc)

We go from this:

----
use strict;
our $DEBUG;

BEGIN {
    $DEBUG = $ENV{'BIOPERLDEBUG'} || 0;

    eval { require Test::More; };
    if( $@ ) {
        use lib 't/lib';
    }
    use Test::More; # the mistake!

    use Module::Build;
    my $build = Module::Build->current();
    my $do_network_tests = $build->notes('network');

    eval {
        require IO::String;
        require LWP;
        require LWP::UserAgent;
    };
    if ($@) {
        plan skip_all => 'IO::String or LWP or LWP::UserAgent not installed. This means Bio::Tools::Run::RemoteBlast is not usable. Skipping tests';
    }
    elsif (!$do_network_tests) {
        plan skip_all => 'Network tests have not been requested, skipping all';
    }
    else {
        plan tests => 21;
    }

    #...
}

my $obj = Bio::Object->new(-verbose => $DEBUG);
#...
----

To this:

----
use strict;

BEGIN {
    use lib 't/lib';
    use BioperlTest;

    test_begin(-requires_modules => [qw(IO::String LWP LWP::UserAgent)],
               -requires_networking => 1,
               -tests => 21);

    #...
}

my $obj = Bio::Object->new(-verbose => test_debug());
#...
----

Can anyone identify problems with this approach? Is the interface presented by BioperlTest flexible enough that any changes would only be additions for new functionality (and therefore all test scripts wouldn't need to be altered)? Is BioperlTest missing anything you'd like?

Are there any objections to me updating all tests in this manner? For an example, see t/RemoteBlast.t

Cheers,
Sendu.

From spiros at lokku.com Wed Jun 20 11:49:48 2007
From: spiros at lokku.com (Spiros Denaxas)
Date: Wed, 20 Jun 2007 16:49:48 +0100
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <4D4061A2-E937-4F97-85C3-8937B597C64F@uiuc.edu>
References: <467661F0.2060703@sendu.me.uk> <4676A01F.30205@sendu.me.uk> <082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu> <4676B41E.3050706@sendu.me.uk> <4D4061A2-E937-4F97-85C3-8937B597C64F@uiuc.edu>
Message-ID:

Yep, they are not all done. Some still need to be ported over; I'm doing some here and there at home. However, the recent email Sendu sent, the one about abstracting the setup of testing, is actually something I was thinking about myself, so it might be a better way to tackle the problem. For one, it would save us from duplicating the same 30 lines of code across all tests.

As far as network tests are concerned, I've always been an avid hater of them. I believe they bring more trouble than they are worth, due to the diversity of setups people have. My way of tackling them was always to group all the tests that required live access into one file and then run just that file explicitly - iff needed and not by default. Like I said, that's just my opinion; I've been bitten by them one time too many.
Spiros On 6/18/07, Chris Fields wrote: > > On Jun 18, 2007, at 11:34 AM, Sendu Bala wrote: > > > Chris Fields wrote: > >> Couldn't you enable BIOPERLDEBUG, disable network access, then > >> iterate through tests checking for those which fail or skip? > > > > Yes, good idea, though my dev machine is also my email/webserver so > > I'd rather come up with an alternate solution than one involving > > 'disable network access'. > > > > Still, that's what I'll probably end up doing. Cheers! > > > > > > Oh, Chris, Spiros, how goes the Test::More conversion? I might want > > to wait for you to finish, or join in? If you're not going to have > > time to do any more in the next few weeks, can you please update > > http://www.bioperl.org/wiki/TestMoreProgress removing your name (or > > in the opposite case, add your name in)? Its not quite clear to me > > which tests are assigned to whom. Can someone clarify what the > > markings mean? > > > > Cheers, > > Sendu. > > Not sure how far along spiros is; I handed it over after I finished > up to the 'Q' tests. In general the ones marked out have been > converted over, ones with names next to them have been claimed. If > you need help I'll prob. start back up again to finish them off; we > just need to divy them up. > > chris > From hlapp at gmx.net Wed Jun 20 12:27:47 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 20 Jun 2007 12:27:47 -0400 Subject: [Bioperl-l] New testing base: BioperlTest.pm In-Reply-To: <467949EC.9040100@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> Message-ID: Very cool! Sounds like a no-brainer to me to adopt this in all the tests. -hilmar On Jun 20, 2007, at 11:38 AM, Sendu Bala wrote: > In considering updating all the test scripts to take advantage of the > new network option, and/or reimplementing them in Test::More, I > thought > now would be a good time to standardize all the test scripts and > reduce > the possibility of having to alter them all in the future if something > changes. 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Wed Jun 20 12:44:01 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 20 Jun 2007 11:44:01 -0500 Subject: [Bioperl-l] New testing base: BioperlTest.pm In-Reply-To: References: <467949EC.9040100@sendu.me.uk> Message-ID: Agreed! You've already created an example case so there's something to go off of. I plan on changing some EUtilities tests soon so I'll try implementing this, basing off your RemoteBlast.t implementation. Seems clear enough on the surface; if I run into problems I'll post. chris On Jun 20, 2007, at 11:27 AM, Hilmar Lapp wrote: > Very cool! Sounds like a no-brainer to me to adopt this in all the > tests. -hilmar > > On Jun 20, 2007, at 11:38 AM, Sendu Bala wrote: > >> In considering updating all the test scripts to take advantage of the >> new network option, and/or reimplementing them in Test::More, I >> thought >> now would be a good time to standardize all the test scripts and >> reduce >> the possibility of having to alter them all in the future if >> something >> changes. >> >> For example we could decide on an alternate way of choosing to run >> network tests, or a new way of deciding to output debug information. >> There are also some inconsistencies in the messages produced by tests >> skipping all, and even an unfortunate mistake that has been copy/ >> pasted >> through a lot of test scripts. 
Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From wollenbergk at mail.nih.gov Wed Jun 20 14:11:04 2007
From: wollenbergk at mail.nih.gov (Wollenberg, Kurt (NIH/NIAID))
Date: Wed, 20 Jun 2007 14:11:04 -0400
Subject: [Bioperl-l] get_sequence() gets some sequences but not others
Message-ID:

Greetings:

I am working on a script to take a list of sequence IDs, extract the sequences from GenPept, and then run a BLAST search for each of the retrieved sequences. I am having a problem with the sequence retrieval, where some sequences are found and others are not and it's not obvious to me why this is.

For example, using a text file containing the two following IDs as input:
SKG3_YEAST
NEM1_YEAST

My script

while( ) {
    chomp;
    my $seqid = $_;
    my $seq_obj = get_sequence( 'genpept', $seqid );
}

will create a sequence object for the first ID, (print "Accession of ",$seqid," is ",$seq_obj->accession, "\n"; gives me the correct accession number) but for the second I am told

-------------------- WARNING ---------------------
MSG: id (NEM1_YEAST) does not exist
---------------------------------------------------

When I pull up these records using the Entrez cross-database search in my web browser I find genpept records for both SKG3_YEAST and NEM1_YEAST (using these search terms). In both records these IDs reside in the same field ("DBSOURCE swissprot: locus") so I'm mystified why get_sequence finds one but not the other. Any advice would be greatly appreciated.

Cheers,
Kurt Wollenberg, Ph.D.
Phylogenetics and Sequence Analysis Consultant
Biocomputing Research Consulting Section
Bioinformatics and Scientific IT Program (BSIP)
NIH/NIAID/OTIS
Contractor, Lockheed Martin
http://bioinformatics.niaid.nih.gov

Disclaimer:
The information in this e-mail and any of its attachments is confidential and may contain sensitive information. It should not be used by anyone who is not the original intended recipient. If you have received this e-mail in error please inform the sender and delete it from your mailbox or any other storage devices. National Institute of Allergy and Infectious Diseases shall not accept liability for any statements made that are sender's own and not expressly made on behalf of the NIAID by one of its representatives.

From bosborne11 at verizon.net Wed Jun 20 14:59:39 2007
From: bosborne11 at verizon.net (Brian Osborne)
Date: Wed, 20 Jun 2007 14:59:39 -0400
Subject: [Bioperl-l] get_sequence() gets some sequences but not others
In-Reply-To:
Message-ID:

Kurt,

I can't answer your question but I wouldn't use Bio::Perl myself; I'd use Bio::DB::GenPept:

501 ~>perl -e 'use Bio::DB::GenPept; $db = Bio::DB::GenPept->new; $seq = $db->get_Seq_by_acc('NEM1_YEAST'); print $seq->seq;'
MNALKYFSNHLITTKKQKKINVEVTKNQDLLGPSKEVSNKYTSHSENDCVSEVDQQYDHSSSHLKESDQNQERKNS
VPKKPKALRSILIEKIASILWALLLFLPYYLIIKPLMSLWFVFTFPLSVIERRVKHTDKRNRGSNASENELPVSSS
NINDSSEKTNPKNCNLNTIPEAVEDDLNASDEIILQRDNVKGSLLRAQSVKSRPRSYSKSELSLSNHSSSNTVFGT
KRMGRFLFPKKLIPKSVLNTQKKKKLVIDLDETLIHSASRSTTHSNSSQGHLVEVKFGLSGIRTLYFIHKRPYCDL
FLTKVSKWYDLIIFTASMKEYADPVIDWLESSFPSSFSKRYYRSDCVLRDGVGYIKDLSIVKDSEENGKGSSSSLD
DVIIIDNSPVSYAMNVDNAIQVEGWISDPTDTDLLNLLPFLEAMRYSTDVRNILALKHGEKAFNIN
502 ~>

It's true that Bio::Perl is easy-to-use but it's also _very_ limited.

Brian O.
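
For a script that walks a list of IDs, as in Kurt's original question, it may help to trap each lookup so that one missing ID does not hide the rest. The following is only a sketch and not code from this thread: the helper name fetch_all and the stand-in lookup table are made up for illustration. With BioPerl installed and network access, the lookup could presumably be Bio::DB::GenPept's get_Seq_by_acc, as used in the one-liner above.

```perl
use strict;
use warnings;

# Try each ID in turn; collect the hits and remember the misses instead of
# stopping at the first failure. $fetch is any code ref that returns a
# sequence (or undef, or dies) for a given ID. fetch_all is a hypothetical
# helper, not part of BioPerl.
sub fetch_all {
    my ($ids, $fetch) = @_;
    my (%found, @missing);
    for my $id (@$ids) {
        my $seq = eval { $fetch->($id) };   # trap exceptions from the lookup
        if (defined $seq) { $found{$id} = $seq }
        else              { push @missing, $id }
    }
    return (\%found, \@missing);
}

# With BioPerl available, the lookup could be (assumption, untested here):
#   use Bio::DB::GenPept;
#   my $db = Bio::DB::GenPept->new;
#   my ($found, $missing) =
#       fetch_all(\@ids, sub { $db->get_Seq_by_acc($_[0]) });

# Stand-in lookup so the sketch runs offline; the data is invented.
my %fake_db = (SKG3_YEAST => 'MSEQ1');
my ($found, $missing) = fetch_all(
    [qw(SKG3_YEAST NEM1_YEAST)],
    sub { $fake_db{ $_[0] } },
);
print "found: $_\n"   for sort keys %$found;
print "missing: $_\n" for @$missing;
```

Passing the lookup in as a code ref keeps the loop independent of which database class ends up being used.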
On 6/20/07 2:11 PM, "Wollenberg, Kurt (NIH/NIAID)" wrote: > Greetings: > > I am working on a script to take a list of sequence IDs, extract the > sequences from GenPept, and then run a BLAST search for each of the > retrieved sequences. I am having a problem with the sequence retrieval, > where some sequences are found and others are not and it's not obvious to me > why this is. > > For example, using a text file containing the two following IDs as input: > SKG3_YEAST > NEM1_YEAST > > My script > > while( ) { > chomp; > my $seqid = $_; > my $seq_obj = get_sequence( 'genpept', $seqid ); > } > > will create a sequence object for the first ID, (print "Accession of > ",$seqid," is ",$seq_obj->accession, "\n"; gives me the correct accession > number) but for the second I am told > > -------------------- WARNING --------------------- > MSG: id (NEM1_YEAST) does not exist > --------------------------------------------------- > > When I pull up these records using the Entrez cross-databse search in my web > browser I find genpept records for both SKG3_YEAST and NEM1_YEAST (using > these search terms). In both records these IDs reside in the same field > ("DBSOURCE swissprot: locus") so I'm mystified why get_sequence finds one > but not the other. Any advice would be greatly appreciated. > > Cheers, > Kurt Wollenberg, Ph.D. > Phylogenetics and Sequence Analysis Consultant > Biocomputing Research Consulting Section > Bioinformatics and Scientific IT Program (BSIP) > NIH/NIAID/OTIS > Contractor, Lockheed Martin > http://bioinformatics.niaid.nih.gov > > Disclaimer: > The information in this e-mail and any of its attachments is confidential > and may contain sensitive information. It should not be used by anyone who > is not the original intended recipient. If you have received this e-mail in > error please inform the sender and delete it from your mailbox or any other > storage devices. 
National Institute of Allergy and Infectious Diseases shall > not accept liability for any statements made that are sender's own and not > expressly made on behalf of the NIAID by one of its representatives. > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at uiuc.edu Wed Jun 20 16:11:34 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 20 Jun 2007 15:11:34 -0500 Subject: [Bioperl-l] get_sequence() gets some sequences but not others In-Reply-To: References: Message-ID: I'm assuming you are using the Bio::Perl exported sub get_sequence (). I am able to reproduce the issue using bioperl-live; it's an odd issue as direct use of Bio::DB::GenPept works fine: use Bio::DB::GenPept; my $factory = Bio::DB::GenPept->new(); my @accs = qw(SKG3_YEAST NEM1_YEAST); my $io = $factory->get_Stream_by_acc(\@accs); while (my $seq = $io->next_seq) { print "Accession:",$seq->accession,"\n"; } chris On Jun 20, 2007, at 1:11 PM, Wollenberg, Kurt (NIH/NIAID) wrote: > Greetings: > > I am working on a script to take a list of sequence IDs, extract the > sequences from GenPept, and then run a BLAST search for each of the > retrieved sequences. I am having a problem with the sequence > retrieval, > where some sequences are found and others are not and it's not > obvious to me > why this is. 
> > For example, using a text file containing the two following IDs as > input: > SKG3_YEAST > NEM1_YEAST > > My script > > while( ) { > chomp; > my $seqid = $_; > my $seq_obj = get_sequence( 'genpept', $seqid ); > } > > will create a sequence object for the first ID, (print "Accession of > ",$seqid," is ",$seq_obj->accession, "\n"; gives me the correct > accession > number) but for the second I am told > > -------------------- WARNING --------------------- > MSG: id (NEM1_YEAST) does not exist > --------------------------------------------------- > > When I pull up these records using the Entrez cross-databse search > in my web > browser I find genpept records for both SKG3_YEAST and NEM1_YEAST > (using > these search terms). In both records these IDs reside in the same > field > ("DBSOURCE swissprot: locus") so I'm mystified why get_sequence > finds one > but not the other. Any advice would be greatly appreciated. > > Cheers, > Kurt Wollenberg, Ph.D. > Phylogenetics and Sequence Analysis Consultant > Biocomputing Research Consulting Section > Bioinformatics and Scientific IT Program (BSIP) > NIH/NIAID/OTIS > Contractor, Lockheed Martin > http://bioinformatics.niaid.nih.gov > > Disclaimer: > The information in this e-mail and any of its attachments is > confidential > and may contain sensitive information. It should not be used by > anyone who > is not the original intended recipient. If you have received this e- > mail in > error please inform the sender and delete it from your mailbox or > any other > storage devices. National Institute of Allergy and Infectious > Diseases shall > not accept liability for any statements made that are sender's own > and not > expressly made on behalf of the NIAID by one of its representatives. > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From sac at bioperl.org Thu Jun 21 02:32:47 2007 From: sac at bioperl.org (Steve Chervitz) Date: Wed, 20 Jun 2007 23:32:47 -0700 Subject: [Bioperl-l] New testing base: BioperlTest.pm In-Reply-To: References: <467949EC.9040100@sendu.me.uk> Message-ID: <8f200b4c0706202332w25a09547k1de20f24466877d9@mail.gmail.com> Looks like a nice refactor. After it's in place, don't forget to update the wiki: http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests Steve On 6/20/07, Chris Fields wrote: > Agreed! You've already created an example case so there's something > to go off of. > > I plan on changing some EUtilities tests soon so I'll try > implementing this, basing off your RemoteBlast.t implementation. > Seems clear enough on the surface; if I run into problems I'll post. > > chris > > On Jun 20, 2007, at 11:27 AM, Hilmar Lapp wrote: > > > Very cool! Sounds like a no-brainer to me to adopt this in all the > > tests. -hilmar > > > > On Jun 20, 2007, at 11:38 AM, Sendu Bala wrote: > > > >> In considering updating all the test scripts to take advantage of the > >> new network option, and/or reimplementing them in Test::More, I > >> thought > >> now would be a good time to standardize all the test scripts and > >> reduce > >> the possibility of having to alter them all in the future if > >> something > >> changes. > >> > >> For example we could decide on an alternate way of choosing to run > >> network tests, or a new way of deciding to output debug information. > >> There are also some inconsistencies in the messages produced by tests > >> skipping all, and even an unfortunate mistake that has been copy/ > >> pasted > >> through a lot of test scripts. 
> >> _______________________________________________ > >> Bioperl-l mailing list > >> Bioperl-l at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > -- > > =========================================================== > > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > > =========================================================== > > > > > > > > > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From staffa at niehs.nih.gov Thu Jun 21 14:36:12 2007 From: staffa at niehs.nih.gov (Staffa, Nick (NIH/NIEHS)) Date: Thu, 21 Jun 2007 14:36:12 -0400 Subject: [Bioperl-l] BIO::DB::FASTA ID Message-ID: This program below returns only 1527 IDs from a fasta file that I have constructed, which has mildred> grep -c "^>Dpse" T_orthologs_Dpse_genes.fa 1820 . It actually does not return the first 3 ids, nor the 5th, nor 7..36, 38,39,41..44...... The header lines are of variable length and the sequence lines are 80 characters except at the ends when they might be shorter. Is there some caveat that I am ignoring in my format that breaks bio::db::fasta? 

#!/usr/bin/perl
#
#
#
use strict;
use Bio::DB::Fasta;
use Bio::Tools::SeqWords;
use Bio::Seq;
use Bio::SeqIO;
$|=1;
#
#
my $Dpse_UTR_file_for_T_orthologs =
    "/home/staffa/clients/Kari/D_pse_genome/testit/T_orthologs_Dpse_genes.fa";
my $db = Bio::DB::Fasta->new
    ('/home/staffa/clients/Kari/D_pse_genome/testit/T_orthologs_Dpse_genes.fa',
     -reindex, -makeid => \&make_my_id);
my @ids = $db->ids;
my $number_in = @ids;
print "number of Dpse IDs = $number_in\n";
foreach my $id (@ids){
    print "$id\n";
}
sub make_my_id {
    # parse header line:
    # >Dpse_GA13134 CG14636 NO UTR has 2 TATTTAT 117 145, 0 TTATTTATT
    my $line = shift;
    # print "line = $line\n";
    $line =~ />(\w+) /;
    my $ID = $1;
    # print "ID = $ID\n";
    return $ID;
}

-------------- next part --------------
A non-text attachment was scrubbed...
Name: T_orthologs_Dpse_genes.fa
Type: application/octet-stream
Size: 5033676 bytes
Desc: not available
Url : http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070621/07c354d0/attachment-0001.obj

From jason at bioperl.org Thu Jun 21 17:19:14 2007
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 21 Jun 2007 14:19:14 -0700
Subject: [Bioperl-l] BIO::DB::FASTA ID
In-Reply-To:
References:
Message-ID:

Hey Nick -
I think
a) your IDs are not unique
b) you need to declare the function make_my_id BEFORE your call Bio::DB::Fasta->new if you want your function to be used.
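
Point (a) can also be checked in plain Perl, without Bio::DB::Fasta. The sketch below is an illustration only: header_id and count_ids are made-up helper names, the header lines are invented in the style of the file, and the pattern is a close variant of the one in make_my_id rather than a copy of it. If IDs repeat, an index keyed by ID would presumably keep only one entry per ID, which would account for fewer IDs than headers.

```perl
use strict;
use warnings;

# Extract the ID from a FASTA header line: the first run of word
# characters directly after '>'. Returns undef for non-header lines.
sub header_id {
    my ($line) = @_;
    return $line =~ /^>(\w+)/ ? $1 : undef;
}

# Count how often each ID occurs among the given lines.
sub count_ids {
    my (@lines) = @_;
    my %count;
    for my $line (@lines) {
        my $id = header_id($line);
        $count{$id}++ if defined $id;
    }
    return \%count;
}

# Invented headers; the second and third share an ID.
my @headers = (
    '>Dpse_GA13134 CG14636 NO UTR',
    '>Dpse_GA10001 CG00001 NO UTR',
    '>Dpse_GA10001 CG00002 NO UTR',
);
my $count = count_ids(@headers);
my @dups  = sort grep { $count->{$_} > 1 } keys %$count;

printf "%d headers, %d unique IDs\n", scalar @headers, scalar keys %$count;
print "duplicated: $_ ($count->{$_} times)\n" for @dups;
```

On a real file the header lines would be read from the FASTA file instead of a hard-coded list; any ID reported as duplicated would need a more specific ID to stay distinct.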
$ grep "^>" T_orthologs_Dpse_genes.fa | awk '{print $1}' | sort | uniq | wc -l 1527 -jason On Jun 21, 2007, at 11:36 AM, Staffa, Nick (NIH/NIEHS) wrote: > #!/usr/bin/perl > # > # > # > use strict; > use Bio::DB::Fasta; > use Bio::Tools::SeqWords; > use Bio::Seq; > use Bio::SeqIO; > $|=1; > # > # > my $Dpse_UTR_file_for_T_orthologs = > "/home/staffa/clients/Kari/D_pse_genome/testit/ > T_orthologs_Dpse_genes.fa"; > my $db = Bio::DB::Fasta->new > ('/home/staffa/clients/Kari/D_pse_genome/testit/ > T_orthologs_Dpse_genes.fa', > -reindex, -makeid => \&make_my_id); > my @ids = $db->ids; > my $number_in = @ids; > print "number of Dpse IDs = $number_in\n"; > foreach my $id (@ids){ > print "$id\n"; > } > sub make_my_id { > # parse header line: > # >Dpse_GA13134 CG14636 NO UTR has 2 TATTTAT 117 145, 0 > TTATTTATT > my $line = shift; > # print "line = $line\n"; > $line =~ />(\w+) /; > my $ID = $1; > # print "ID = $ID\n"; > return $ID; > } -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From mkiwala at watson.wustl.edu Thu Jun 21 17:23:46 2007 From: mkiwala at watson.wustl.edu (Michael Kiwala) Date: Thu, 21 Jun 2007 16:23:46 -0500 Subject: [Bioperl-l] BIO::DB::FASTA ID In-Reply-To: References: Message-ID: <467AEC62.2040508@watson.wustl.edu> You only have 1527 unique id's in the file. ~$ grep '^>' Desktop/T_orthologs_Dpse_genes.fa|cut -d\ -f1|sort -u|wc -l 1527 Change your make_id function to make sure the id's are unique. Staffa, Nick (NIH/NIEHS) wrote: > This program below returns only 1527 IDs from a fasta file that I have > constructed, which has > mildred> grep -c "^>Dpse" T_orthologs_Dpse_genes.fa > 1820 > . > It actually does not return the first 3 ids, > nor the 5th, nor 7..36, 38,39,41..44...... > The header lines are of variable length and the sequence lines are 80 > characters except at the ends when they might be shorter. > Is there some caveat that I am ignoring in my format that breaks > bio::db::fasta? 

From bix at sendu.me.uk Mon Jun 25 09:06:27 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 25 Jun 2007 14:06:27 +0100
Subject: [Bioperl-l] New testing base: BioperlTest.pm
In-Reply-To: <467949EC.9040100@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
Message-ID: <467FBDD3.8050009@sendu.me.uk>

Sendu Bala wrote:
> In considering updating all the test scripts to [... use] t/lib/BioperlTest.pm

I'm now in the process of converting all test scripts. In addition to those things mentioned previously, BioperlTest now also provides the methods test_input_file() and test_output_file().

This:
----
use Bio::Root::IO;
my $output_file = Bio::Root::IO->catfile(qw(t data temp.file));
$obj->new(-file => ">$output_file");

END {
    unlink($output_file);
}

...

$obj->new(-file => Bio::Root::IO->catfile(qw(t data input.file)));
----

Becomes this:
----
my $output_file = test_output_file();
$obj->new(-file => ">$output_file");

...

$obj->new(-file => test_input_file('input.file'));
----

I should think the benefits are obvious, especially for the output files, which thanks to inconsistency of using END blocks correctly or at all, leaves some output data behind on occasion.

test_input_file() is helpful for the shorthand, but also gets rid of many tests' usage of Bio::Root::IO (relying on something you're installing and testing in another test script to work in the current test script, without testing it in your own test script seems like a no-no to me).

From cjfields at uiuc.edu Mon Jun 25 09:39:21 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 25 Jun 2007 08:39:21 -0500
Subject: [Bioperl-l] New testing base: BioperlTest.pm
In-Reply-To: <467FBDD3.8050009@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
Message-ID: <53825978-65F8-49F4-8836-2F6A26A5CAC4@uiuc.edu>

On Jun 25, 2007, at 8:06 AM, Sendu Bala wrote:

> Sendu Bala wrote:
>> In considering updating all the test scripts to [... use] t/lib/BioperlTest.pm
>
> I'm now in the process of converting all test scripts. In addition to
> those things mentioned previously, BioperlTest now also provides the
> methods test_input_file() and test_output_file().
>
> This:
> ----
> use Bio::Root::IO;
> my $output_file = Bio::Root::IO->catfile(qw(t data temp.file));
> $obj->new(-file => ">$output_file");
>
> END {
>     unlink($output_file);
> }
>
> ...
>
> $obj->new(-file => Bio::Root::IO->catfile(qw(t data input.file)));
> ----
>
> Becomes this:
> ----
> my $output_file = test_output_file();
> $obj->new(-file => ">$output_file");
>
> ...
>
> $obj->new(-file => test_input_file('input.file'));
> ----
>
> I should think the benefits are obvious, especially for the output
> files, which thanks to inconsistency of using END blocks correctly or at
> all, leaves some output data behind on occasion.

Sounds fine by me, though it's a lot of work.
BTW, did we ever decide whether to finish up with Test::More conversion? I haven't heard back yet; let me know what you want to do. > test_input_file() is helpful for the shorthand, but also gets rid of > many tests' usage of Bio::Root::IO (relying on something you're > installing and testing in another test script to work in the current > test script, without testing it in your own test script seems like a > no-no to me). Well, in a way isn't that itself a test of the class (whether it breaks or not)? ; > Do test_input_file() and test_input_file() handle directory structures in an OS-safe way like catfile()? For instance, I plan on adding test data to a new directory similar to Bio::Graphics (t/data/ eutil) to prevent cluttering of the t/data directory. I could use '$obj->new(-file => test_input_file('/eutil/input.xml'))' if the base directory is 't/data' but that may not be cross-platform compatible with win32 file systems, which may still expect something like 't\data \eutil\input.xml'. chris From bix at sendu.me.uk Mon Jun 25 09:45:23 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 25 Jun 2007 14:45:23 +0100 Subject: [Bioperl-l] New testing base: BioperlTest.pm In-Reply-To: <53825978-65F8-49F4-8836-2F6A26A5CAC4@uiuc.edu> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <53825978-65F8-49F4-8836-2F6A26A5CAC4@uiuc.edu> Message-ID: <467FC6F3.6080705@sendu.me.uk> Chris Fields wrote: > On Jun 25, 2007, at 8:06 AM, Sendu Bala wrote: >> I should think the benefits are obvious, especially for the output >> files, which thanks to inconsistency of using END blocks correctly or at >> all, leaves some output data behind on occasion. > > Sounds fine by me, though it's a lot of work. BTW, did we ever decide > whether to finish up with Test::More conversion? I haven't heard back > yet; let me know what you want to do. I'm doing the remaining Test::More conversions at the same time. 
> Do test_input_file() and test_input_file() handle directory structures
> in an OS-safe way like catfile()? For instance, I plan on adding test
> data to a new directory similar to Bio::Graphics (t/data/eutil) to
> prevent cluttering of the t/data directory. I could use
> '$obj->new(-file => test_input_file('/eutil/input.xml'))' if the base
> directory is 't/data' but that may not be cross-platform compatible with
> win32 file systems, which may still expect something like
> 't\data\eutil\input.xml'.

It's platform-independent, currently implemented using File::Spec. So
you'll say:

$obj->new(-file => test_input_file('eutil', 'input.xml'));

It's all documented in the POD of BioperlTest.

From cjfields at uiuc.edu Mon Jun 25 09:49:51 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 25 Jun 2007 08:49:51 -0500
Subject: [Bioperl-l] New testing base: BioperlTest.pm
In-Reply-To: <467FC6F3.6080705@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <53825978-65F8-49F4-8836-2F6A26A5CAC4@uiuc.edu> <467FC6F3.6080705@sendu.me.uk>
Message-ID: <679B8E76-C090-4A29-B843-99B5853FE2FB@uiuc.edu>

On Jun 25, 2007, at 8:45 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> On Jun 25, 2007, at 8:06 AM, Sendu Bala wrote:
>>> I should think the benefits are obvious, especially for the output
>>> files, which thanks to inconsistency of using END blocks
>>> correctly or at
>>> all, leaves some output data behind on occasion.
>> Sounds fine by me, though it's a lot of work. BTW, did we ever
>> decide whether to finish up with Test::More conversion? I haven't
>> heard back yet; let me know what you want to do.
>
> I'm doing the remaining Test::More conversions at the same time.

Okay. Just didn't want to do any redundant work if it's already
being/been done.

>> Do test_input_file() and test_input_file() handle directory
>> structures in an OS-safe way like catfile()?
For instance, I plan >> on adding test data to a new directory similar to Bio::Graphics (t/ >> data/eutil) to prevent cluttering of the t/data directory. I >> could use '$obj->new(-file => test_input_file('/eutil/ >> input.xml'))' if the base directory is 't/data' but that may not >> be cross-platform compatible with win32 file systems, which may >> still expect something like 't\data\eutil\input.xml'. > > Its platform-independent, currently implemented using File::Spec. > So you'll say: > > $obj->new(-file => test_input_file('eutil', 'input.xml')); > > Its all documented in the POD of BioperlTest. yay! chris From mmokrejs at ribosome.natur.cuni.cz Mon Jun 25 12:06:24 2007 From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=) Date: Mon, 25 Jun 2007 18:06:24 +0200 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? In-Reply-To: <467254DD.3010505@mrc-lmb.cam.ac.uk> References: <466938F6.7050903@ribosome.natur.cuni.cz> <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> <467178AE.5040905@ribosome.natur.cuni.cz> <46717990.6040509@ribosome.natur.cuni.cz> <467254DD.3010505@mrc-lmb.cam.ac.uk> Message-ID: <467FE800.4010300@ribosome.natur.cuni.cz> Dave Howorth wrote: > Martin MOKREJ? wrote: >>>> Also, there is a *huge* amount of documentation and examples on >>>> the BioPerl website. >>>> >>>> http://www.bioperl.org/wiki/HOWTOs >>> You mean >>> http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File >>> ? ;-) >> $ perl embl2picture.pl ~/99.gb | display - Error returned while >> evaluating value of 'description' option for glyph >> Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature >> Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method >> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl >> line 141, line 125. > > Hmm an error at line 141 of a 69 line script? Methinks you're not > actually running the script that's presented on the wiki page you > quoted. 
I cut-and-pasted the script and your file and it worked for me > (at least, it produced an image, along with a bunch of OOPS lines) Maybe you used the first version of the script? There are two or more scripts, I used the very last one. M. From cjfields at uiuc.edu Mon Jun 25 12:48:30 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 25 Jun 2007 11:48:30 -0500 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? In-Reply-To: <467FE7B0.3010904@ribosome.natur.cuni.cz> References: <466938F6.7050903@ribosome.natur.cuni.cz> <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> <467178AE.5040905@ribosome.natur.cuni.cz> <46717990.6040509@ribosome.natur.cuni.cz> <46723F91.60501@ribosome.natur.cuni.cz> <467FE7B0.3010904@ribosome.natur.cuni.cz> Message-ID: Martin, Keep bioperl-related discussion on the bioperl mail list. The large majority of this isn't biopython-related, but maybe some devs there can add to this? On Jun 25, 2007, at 11:05 AM, Martin MOKREJ? wrote: ... > Would you please tell me exactly what is wrong with the spacing? Here's a section of the seq record attached to your previous email: DEFINITION . ACCESSION . VERSION . SOURCE . ORGANISM . Normally there is a fixed column width for any data present in a field, so it would look more like this: DEFINITION PYR4 (DIHYDROOROTASE, PYRIMIDIN 4, dihydroorotase); dihydroorotase [Arabidopsis thaliana]. ACCESSION NP_194024 VERSION NP_194024.1 GI:15235865 DBSOURCE REFSEQ: accession NM_118422.3 KEYWORDS . SOURCE Arabidopsis thaliana (thale cress) ORGANISM Arabidopsis thaliana Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta; Spermatophyta; Magnoliophyta; eudicotyledons; core eudicotyledons; rosids; eurosids II; Brassicales; Brassicaceae; Arabidopsis. Here's the relevant bit in the latest release notes: "The second part of each sequence entry record contains the information appropriate to its keyword, in positions 13 to 80 for keywords and positions 11 to 80 for the sequence." 
The bioperl devs try to make our parsers as flexible as possible but others may not, so it's something in ApE that should probably be fixed. And as mentioned to you several times in the past on the mail list and on bugzilla, don't expect sequence records which sway from the standard (in this case, the release notes) to parse correctly in all cases. We can try supporting some that sway from that standard but only up to a point. If it causes additional bugs, headaches, or degrades performance it won't be supported. > ... > Well, I just copy&pasted the script from the bioperl webpages, I think > from a tutorial or FAQ, don't remember anymore. Well, can't help you if you can't point out where the code originated from. We would like to know so it can be corrected. > ... > Well, my search for such tools available on Unix to be used in a > script, > non-interactively, completely failed. My last hope except getting > improved > ApE is to use the GenomeDiagram under biopython, but so far my .gb > files > cannot be parsed yet. :( > Martin As mentioned previously you will likely have to code for it yourself (perl or python) or help debug the relevant biopython code to get it working. We can't/won't do this for you unless/until it's something we feel warrants implementation. Judging by the bug list, we also haven't the time nor inclination to code for it. Sorry but we have other priorities besides doing your work for you. chris From jesper at krogh.cc Tue Jun 26 03:05:32 2007 From: jesper at krogh.cc (Jesper Krogh) Date: Tue, 26 Jun 2007 09:05:32 +0200 (CEST) Subject: [Bioperl-l] Possible bug in Bio::SeqIO::embl.pm Message-ID: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk> Hi List. Trying to parse the embl database, the embl-parser fails on: AB019196 http://www.ebi.ac.uk/cgi-bin/expasyfetch?AB019196 ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: AB019196 seems to have an invalid species classification. 
STACK: Error::throw STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/Root.pm:359 STACK: Bio::SeqIO::embl::_read_EMBL_Species /usr/share/perl/5.8/Bio/SeqIO/embl.pm:1091 STACK: Bio::SeqIO::embl::next_seq /usr/share/perl/5.8/Bio/SeqIO/embl.pm:322 STACK: -e:1 ----------------------------------------------------------- It seems to be dissatisfied with this: OS Acetobacter aceti OC Bacteria; Proteobacteria; Alphaproteobacteria; Rhodospirillales; OC Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter. Thanks. -- Jesper Krogh From cjfields at uiuc.edu Tue Jun 26 09:13:50 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 26 Jun 2007 08:13:50 -0500 Subject: [Bioperl-l] Possible bug in Bio::SeqIO::embl.pm In-Reply-To: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk> References: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk> Message-ID: <246C49A7-D91E-4E74-921D-4B9C174AB41F@uiuc.edu> I can verify this using bioperl-live. Can you file this as a bug? http://bugzilla.open-bio.org/ chris On Jun 26, 2007, at 2:05 AM, Jesper Krogh wrote: > Hi List. > > Trying to parse the embl database, the embl-parser fails on: AB019196 > http://www.ebi.ac.uk/cgi-bin/expasyfetch?AB019196 > > > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: AB019196 seems to have an invalid species classification. > STACK: Error::throw > STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/Root.pm:359 > STACK: Bio::SeqIO::embl::_read_EMBL_Species > /usr/share/perl/5.8/Bio/SeqIO/embl.pm:1091 > STACK: Bio::SeqIO::embl::next_seq > /usr/share/perl/5.8/Bio/SeqIO/embl.pm:322 > STACK: -e:1 > ----------------------------------------------------------- > > > It seems to be dissatisfied with this: > OS Acetobacter aceti > OC Bacteria; Proteobacteria; Alphaproteobacteria; Rhodospirillales; > OC Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter. > > Thanks. 
> -- > Jesper Krogh > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From suji_ramin at yahoo.com Tue Jun 26 00:58:36 2007 From: suji_ramin at yahoo.com (SujiBala) Date: Mon, 25 Jun 2007 21:58:36 -0700 (PDT) Subject: [Bioperl-l] Error in constructing Phylogenetic tree using BioPerl Message-ID: <571051.26423.qm@web51107.mail.re2.yahoo.com> Hi Hello This is sujatha from singapore. I am trying to construct phylo tree using DNAStatistics and Kirma method. But I am getting the following error message. It would be nice if you could help me resolve this problem asap. Error messasge Must supply a valid Bio::Align::AlignI for the _align parameter in the distance My program use Bio::AlignIO; use Bio::Align::DNAStatistics; use Bio::Tree::DistanceFactory; # for a dna alignment can also use ProteinStatistics @aln = Bio::AlignIO->new(-file => 'out4.fa', -format=>'clustalw'); $stats = Bio::Align::DNAStatistics->new; $mat = $stats->distance( -align => @aln,-method => 'Kimura'); $dfactory = Bio::Tree::DistanceFactory->new(-method => 'NJ'); $tree = $dfactory->make_tree($mat); I am using clustalw formatted fasta file with more than one sequence SujiBala --------------------------------- Luggage? GPS? Comic books? Check out fitting gifts for grads at Yahoo! Search. From bartels.stefan at mh-hannover.de Tue Jun 26 05:26:03 2007 From: bartels.stefan at mh-hannover.de (don esteban) Date: Tue, 26 Jun 2007 02:26:03 -0700 (PDT) Subject: [Bioperl-l] Example code in Bioperl Tutorial In-Reply-To: References: Message-ID: <11302459.post@talk.nabble.com> Try using the Proxyconfiguration in your script: $ENV{"HTTP_PROXY"}="http://proxy.somewhere.org:8080"; L Xu wrote: > > I do have the internet connection bu not use the proxy server. 
> I tested the network connection with ping command (below). The ncbi > website > does not response. Is there any special network setting needed for > connecting the ncbi website? > Thank you so much. > > C:\>ping www.yahoo.com > > Pinging www.yahoo-ht3.akadns.net [69.147.114.210] with 32 bytes of data: > > Reply from 69.147.114.210: bytes=32 time=363ms TTL=45 > Reply from 69.147.114.210: bytes=32 time=319ms TTL=45 > Reply from 69.147.114.210: bytes=32 time=312ms TTL=45 > Reply from 69.147.114.210: bytes=32 time=360ms TTL=45 > > Ping statistics for 69.147.114.210: > Packets: Sent = 4, Received = 4, Lost = 0 (0% loss), > Approximate round trip times in milli-seconds: > Minimum = 312ms, Maximum = 363ms, Average = 338ms > > C:\>ping www.ncbi.nlm.nih.gov > > Pinging www.ncbi.nlm.nih.gov [130.14.29.110] with 32 bytes of data: > > Request timed out. > Request timed out. > Request timed out. > Request timed out. > > Ping statistics for 130.14.29.110: > Packets: Sent = 4, Received = 0, Lost = 4 (100% loss), > > > > = = = Original message = = = > > Judging by the output it looks like you have no network access or? can't > connect to the server (what remoteblast needs).? Make sure you? don't need > proxy settings. > > To preempt the next question, no, I'm not going to explain what a? proxy > is.? The RemoteBlast docs show how to set them, and Google is a? wonderful > tool... > > chris > > On Jun 13, 2007, at 7:16 AM, L Xu wrote: > > > ... > -------------------- WARNING --------------------- > MSG: > An Error Occurred > >

> 500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
>
> ---------------------------------------------------
> ...
>

-- 
View this message in context: http://www.nabble.com/Example-code-in-Bioperl-Tutorial-tf3914295.html#a11302459
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.

From rahall2 at ualr.edu Tue Jun 26 09:51:08 2007
From: rahall2 at ualr.edu (Roger Hall)
Date: Tue, 26 Jun 2007 08:51:08 -0500
Subject: [Bioperl-l] Tuesday: ill
Message-ID: <000001c7b7f9$0d029040$4601a8c0@LIBERAL2>

Well I guess I won't be in today after all.

Michael, Stephen, and Ames: please call me from the grad office at 10 on
my cell phone (744-8514).

Phil: please go ahead and meet with Tim, and let me know what questions
remain afterwards.

Thanks!

Roger Hall
Technical Director
MidSouth Bioinformatics Center
University of Arkansas at Little Rock
(501) 569-8074

From cjfields at uiuc.edu Tue Jun 26 10:02:29 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 26 Jun 2007 09:02:29 -0500
Subject: [Bioperl-l] Possible bug in Bio::SeqIO::embl.pm
In-Reply-To: <4681185D.5030402@cam.ac.uk>
References: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk> <246C49A7-D91E-4E74-921D-4B9C174AB41F@uiuc.edu> <4681185D.5030402@cam.ac.uk>
Message-ID: 

I'll try getting to that ASAP (as well as a few bugs).
The problem is we have to patch this in 2-3 places (SeqIO::swiss, SeqIO::embl) due to repeated code issues, something I'm trying to rectify with a new set of parsers. Just haven't had the time to work on them lately unfortunately. chris On Jun 26, 2007, at 8:45 AM, Roy Chaudhuri wrote: > Sorry, replied to this but forgot to cc the list. > > It looks like a related problem to bug 2288 that I filed about > Bio::SeqIO::swiss - the period after subgen. is what causes the > problems since it is interpreted as a seperator between nodes. I > put a patch in for Bio::SeqIO::swiss that works for me, but I guess > it might have side effects. > > Roy. > -- > Dr. Roy Chaudhuri > Department of Veterinary Medicine > University of Cambridge, U.K. > > Chris Fields wrote: >> I can verify this using bioperl-live. Can you file this as a bug? >> http://bugzilla.open-bio.org/ >> chris >> On Jun 26, 2007, at 2:05 AM, Jesper Krogh wrote: >>> Hi List. >>> >>> Trying to parse the embl database, the embl-parser fails on: >>> AB019196 >>> http://www.ebi.ac.uk/cgi-bin/expasyfetch?AB019196 >>> >>> >>> ------------- EXCEPTION: Bio::Root::Exception ------------- >>> MSG: AB019196 seems to have an invalid species classification. >>> STACK: Error::throw >>> STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/ >>> Root.pm:359 >>> STACK: Bio::SeqIO::embl::_read_EMBL_Species >>> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:1091 >>> STACK: Bio::SeqIO::embl::next_seq >>> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:322 >>> STACK: -e:1 >>> ----------------------------------------------------------- >>> >>> >>> It seems to be dissatisfied with this: >>> OS Acetobacter aceti >>> OC Bacteria; Proteobacteria; Alphaproteobacteria; >>> Rhodospirillales; >>> OC Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter. >>> >>> Thanks. 
>>> -- >>> Jesper Krogh >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> Christopher Fields >> Postdoctoral Researcher >> Lab of Dr. Robert Switzer >> Dept of Biochemistry >> University of Illinois Urbana-Champaign >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From rrc22 at cam.ac.uk Tue Jun 26 09:45:01 2007 From: rrc22 at cam.ac.uk (Roy Chaudhuri) Date: Tue, 26 Jun 2007 14:45:01 +0100 Subject: [Bioperl-l] Possible bug in Bio::SeqIO::embl.pm In-Reply-To: <246C49A7-D91E-4E74-921D-4B9C174AB41F@uiuc.edu> References: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk> <246C49A7-D91E-4E74-921D-4B9C174AB41F@uiuc.edu> Message-ID: <4681185D.5030402@cam.ac.uk> Sorry, replied to this but forgot to cc the list. It looks like a related problem to bug 2288 that I filed about Bio::SeqIO::swiss - the period after subgen. is what causes the problems since it is interpreted as a seperator between nodes. I put a patch in for Bio::SeqIO::swiss that works for me, but I guess it might have side effects. Roy. -- Dr. Roy Chaudhuri Department of Veterinary Medicine University of Cambridge, U.K. Chris Fields wrote: > I can verify this using bioperl-live. Can you file this as a bug? > > http://bugzilla.open-bio.org/ > > chris > > On Jun 26, 2007, at 2:05 AM, Jesper Krogh wrote: > >> Hi List. >> >> Trying to parse the embl database, the embl-parser fails on: AB019196 >> http://www.ebi.ac.uk/cgi-bin/expasyfetch?AB019196 >> >> >> ------------- EXCEPTION: Bio::Root::Exception ------------- >> MSG: AB019196 seems to have an invalid species classification. 
>> STACK: Error::throw >> STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/Root.pm:359 >> STACK: Bio::SeqIO::embl::_read_EMBL_Species >> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:1091 >> STACK: Bio::SeqIO::embl::next_seq >> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:322 >> STACK: -e:1 >> ----------------------------------------------------------- >> >> >> It seems to be dissatisfied with this: >> OS Acetobacter aceti >> OC Bacteria; Proteobacteria; Alphaproteobacteria; Rhodospirillales; >> OC Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter. >> >> Thanks. >> -- >> Jesper Krogh >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From bix at sendu.me.uk Tue Jun 26 10:13:48 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 26 Jun 2007 15:13:48 +0100 Subject: [Bioperl-l] Error in constructing Phylogenetic tree using BioPerl In-Reply-To: <571051.26423.qm@web51107.mail.re2.yahoo.com> References: <571051.26423.qm@web51107.mail.re2.yahoo.com> Message-ID: <46811F1C.3020307@sendu.me.uk> SujiBala wrote: > Hi Hello > This is sujatha from singapore. I am trying to construct phylo tree using DNAStatistics and Kirma method. But I am getting the following error message. It would be nice if you could help me resolve this problem asap. 
> > Error messasge > Must supply a valid Bio::Align::AlignI for the _align parameter in the distance > My program > use Bio::AlignIO; > use Bio::Align::DNAStatistics; > use Bio::Tree::DistanceFactory; > # for a dna alignment can also use ProteinStatistics > @aln = Bio::AlignIO->new(-file => 'out4.fa', -format=>'clustalw'); > $stats = Bio::Align::DNAStatistics->new; > $mat = $stats->distance( -align => @aln,-method => 'Kimura'); Without looking at the docs for these modules, it is immediately obvious that Bio::AlignIO->new() is going to return an instance of Bio::AlignIO and not an array of alignments. It is also obvious that the -align => parameter for the distance() method can't take an array of anything (but probably an array ref?). Check the documentation and make sure you know what objects you're generating and passing around. From schlesi at ebi.ac.uk Tue Jun 26 10:59:13 2007 From: schlesi at ebi.ac.uk (Felix Schlesinger) Date: Tue, 26 Jun 2007 15:59:13 +0100 Subject: [Bioperl-l] PAML parser Message-ID: <7317d50c0706260759r3bdda445lf0acf10a31ee5765@mail.gmail.com> Hello, I am trying to use the PAML result parser (BioPerl Bio::Tools::Phylo::PAML) on output files generated by PAML 3.15. However on all outputs I have tested no result object is returned (next_result is undef). This includes the HIV and Lysin datasets included with PAML. My code is: my $codemlp = Bio::Tools::Phylo::PAML->new(-file => "mlc",dir => "/."); my $result = $codemlp->next_result; foreach my $model ( $result->get_NSSite_results ) { ... and the error is: Can't call method "get_NSSite_results" on an undefined value ... I can include the mlc file is needed. Is this supposed to work? Or do I have to run paml from bioperl to parse the results? 
Thanks Felix From Xianjun.Dong at bccs.uib.no Tue Jun 26 10:35:17 2007 From: Xianjun.Dong at bccs.uib.no (Xianjun Dong) Date: Tue, 26 Jun 2007 16:35:17 +0200 Subject: [Bioperl-l] bug for PAML::Baseml Message-ID: <46812425.8000509@ii.uib.no> An HTML attachment was scrubbed... URL: http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070626/cb3d8193/attachment-0001.html From Xianjun.Dong at bccs.uib.no Tue Jun 26 11:40:47 2007 From: Xianjun.Dong at bccs.uib.no (Xianjun Dong) Date: Tue, 26 Jun 2007 17:40:47 +0200 Subject: [Bioperl-l] bug for PAML::Baseml In-Reply-To: <46812425.8000509@ii.uib.no> References: <46812425.8000509@ii.uib.no> Message-ID: <4681337F.1000902@ii.uib.no> An HTML attachment was scrubbed... URL: http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070626/604ce866/attachment.html From hartzell at alerce.com Tue Jun 26 14:12:04 2007 From: hartzell at alerce.com (George Hartzell) Date: Tue, 26 Jun 2007 14:12:04 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> Message-ID: <18049.22260.967524.353173@almost.alerce.com> There don't seem to be any .cvsignore files in the repository, or in CVSROOT/cvsignore. Am I missing something, or don't we use them? g. 
From cjfields at uiuc.edu Tue Jun 26 15:54:25 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 26 Jun 2007 14:54:25 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <18049.22260.967524.353173@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.22260.967524.353173@almost.alerce.com> Message-ID: <74515C87-5553-4AF0-9B83-26F3E71E15C8@uiuc.edu> Not sure. You may want to email support at open-bio.org; my guess is Chris D or Jason would have an answer. chris On Jun 26, 2007, at 1:12 PM, George Hartzell wrote: > > There don't seem to be any .cvsignore files in the repository, or in > CVSROOT/cvsignore. > > Am I missing something, or don't we use them? > > g. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Tue Jun 26 15:55:21 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 26 Jun 2007 16:55:21 -0300 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <18049.22260.967524.353173@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.22260.967524.353173@almost.alerce.com> Message-ID: Maybe we've been using the default? On Jun 26, 2007, at 3:12 PM, George Hartzell wrote: > > There don't seem to be any .cvsignore files in the repository, or in > CVSROOT/cvsignore. > > Am I missing something, or don't we use them? > > g. 
> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hartzell at alerce.com Tue Jun 26 16:21:30 2007 From: hartzell at alerce.com (George Hartzell) Date: Tue, 26 Jun 2007 16:21:30 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> Message-ID: <18049.30026.61328.134490@almost.alerce.com> Chris Fields writes: > [...] > It looks like George Hartzell may be taking a crack at it, with > Rutger Vos, Nathan Haigh, and moi helping out where needed. If so we > could have something testable relatively soon. After that we'll need > to work out a few other issues, basically what's on Hilmar's list. There's a repository on file:///home/hartzell/bioperl with all of the components projects in place. If you have a dev.open-bio.org account and you're in the bioperl group, you're good to get at it via: file:///home/hartzell/bioperl or svn+ssh://dev.open-bio.org/home/hartzell/bioperl There are a couple of things to think about: - how are we going to provide access. I *think* that I heard a decision to use http:// and https://. Who gets to set that up? - what do we want to do about keywords. The cvs2svn tool guesses and automatically sets the svn:keywords property to Author Date Revision and Id on many of the files in the tree. If it looks like it got it right, we can stick with it. 
Or, we can disable that conversion and I've cribbed a little script that'll grep out files using Id and set the svn:keywords property accordingly. - what do we want to do about svn:ignore? I haven't seen any .cvsignore files. Beyond that, how does the repo look? How are we going to cut over? Are we going to try to push svn commits to the read-mostly CVS repo, or just keep it around for history's sake (I lean towards the latter). g. From jason at bioperl.org Tue Jun 26 19:22:20 2007 From: jason at bioperl.org (Jason Stajich) Date: Tue, 26 Jun 2007 20:22:20 -0300 Subject: [Bioperl-l] PAML parser In-Reply-To: <7317d50c0706260759r3bdda445lf0acf10a31ee5765@mail.gmail.com> References: <7317d50c0706260759r3bdda445lf0acf10a31ee5765@mail.gmail.com> Message-ID: Can you make sure you have the latest and greatest version of these modules from the CVS repository? We had to fix things to parse 3.15 -- I can't tell if this is the problem or something else. You can also add -verbose => 1when you initialize the object and it may spit out more warnings about whether it is having problems. -jason On Jun 26, 2007, at 11:59 AM, Felix Schlesinger wrote: > Hello, > > I am trying to use the PAML result parser (BioPerl > Bio::Tools::Phylo::PAML) on output files generated by PAML 3.15. > However on all outputs I have tested no result object is returned > (next_result is undef). This includes the HIV and Lysin datasets > included with PAML. > My code is: > > my $codemlp = Bio::Tools::Phylo::PAML->new(-file => "mlc",dir => > "/."); > my $result = $codemlp->next_result; > foreach my $model ( $result->get_NSSite_results ) { > ... > > and the error is: Can't call method "get_NSSite_results" on an > undefined value ... > > I can include the mlc file is needed. Is this supposed to work? Or do > I have to run paml from bioperl to parse the results? 
> > Thanks > Felix > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From jason at bioperl.org Tue Jun 26 19:27:05 2007 From: jason at bioperl.org (Jason Stajich) Date: Tue, 26 Jun 2007 20:27:05 -0300 Subject: [Bioperl-l] Error in constructing Phylogenetic tree using BioPerl In-Reply-To: <46811F1C.3020307@sendu.me.uk> References: <571051.26423.qm@web51107.mail.re2.yahoo.com> <46811F1C.3020307@sendu.me.uk> Message-ID: On Jun 26, 2007, at 11:13 AM, Sendu Bala wrote: > SujiBala wrote: >> Hi Hello >> This is sujatha from singapore. I am trying to construct phylo >> tree using DNAStatistics and Kirma method. But I am getting the >> following error message. It would be nice if you could help me >> resolve this problem asap. >> >> Error messasge >> Must supply a valid Bio::Align::AlignI for the _align >> parameter in the distance >> My program >> use Bio::AlignIO; >> use Bio::Align::DNAStatistics; >> use Bio::Tree::DistanceFactory; >> # for a dna alignment can also use ProteinStatistics >> @aln = Bio::AlignIO->new(-file => 'out4.fa', -format=>'clustalw'); >> $stats = Bio::Align::DNAStatistics->new; >> $mat = $stats->distance( -align => @aln,-method => 'Kimura'); > yep you want to call next_aln on the Bio::AlignIO object. I fixed the example code in the HOWTO so it should work properly now; http://bioperl.org/wiki/HOWTO:Trees#Constructing_Trees > Without looking at the docs for these modules, it is immediately > obvious > that Bio::AlignIO->new() is going to return an instance of > Bio::AlignIO > and not an array of alignments. It is also obvious that the -align => > parameter for the distance() method can't take an array of anything > (but > probably an array ref?). > > Check the documentation and make sure you know what objects you're > generating and passing around. 
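
For readers following the tree-building question quoted above, here is a minimal sketch of the script with the two points from the replies applied: read one alignment with next_aln(), and pass that single object (not an array) as -align. It is an illustration, not the poster's final code. The file name 'out4.fa' and the 'clustalw' format are taken from the original post; the -format value has to match what the file really contains, and the Newick output at the end is an added assumption for demonstration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use Bio::AlignIO;
use Bio::Align::DNAStatistics;
use Bio::Tree::DistanceFactory;
use Bio::TreeIO;

# File name and format come from the original post; adjust -format to
# the file's real contents.
my $alnio = Bio::AlignIO->new(-file => 'out4.fa', -format => 'clustalw');

# new() returns a reader object; next_aln() returns one alignment.
my $aln = $alnio->next_aln
    or die "No alignment could be read from out4.fa\n";

# Pairwise distance matrix using the Kimura method.
my $stats = Bio::Align::DNAStatistics->new;
my $mat   = $stats->distance(-align => $aln, -method => 'Kimura');

# Neighbor-joining tree built from the distance matrix.
my $dfactory = Bio::Tree::DistanceFactory->new(-method => 'NJ');
my $tree     = $dfactory->make_tree($mat);

# Assumption for illustration: print the tree in Newick format to STDOUT.
my $out = Bio::TreeIO->new(-format => 'newick', -fh => \*STDOUT);
$out->write_tree($tree);
```

If the file holds several alignments, next_aln() can be called in a while loop to build one tree per alignment.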
-- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From jason at bioperl.org Tue Jun 26 19:29:11 2007 From: jason at bioperl.org (Jason Stajich) Date: Tue, 26 Jun 2007 20:29:11 -0300 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.22260.967524.353173@almost.alerce.com> Message-ID: <5A8FD8A3-9593-4925-AA74-D4B03CDC1C34@bioperl.org> We don't have one. I have one on my local machine that basically defines *~ and .#* so I never had a problem. Feel free to propose one if you think it is important; I never really thought it was. On Jun 26, 2007, at 4:55 PM, Hilmar Lapp wrote: > Maybe we've been using the default? > > On Jun 26, 2007, at 3:12 PM, George Hartzell wrote: > >> >> There don't seem to be any .cvsignore files in the repository, or in >> CVSROOT/cvsignore. >> >> Am I missing something, or don't we use them? >> >> g.
>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From j_martin at lbl.gov Tue Jun 26 21:01:29 2007 From: j_martin at lbl.gov (Joel Martin) Date: Tue, 26 Jun 2007 18:01:29 -0700 Subject: [Bioperl-l] Example code in Bioperl Tutorial In-Reply-To: <11302459.post@talk.nabble.com> References: <11302459.post@talk.nabble.com> Message-ID: <20070627010129.GA8628@eniac.jgi-psf.org> Hello, The tutorial code snippet is an endless loop, I think it's supposed to remove the rid. As the only print statement you added is after the endless loop, you aren't seeing anything happen. Use the code from this instead, perldoc Bio::Tools::Run::RemoteBlast The bptutorial.pl does have a note that it's not useful and to read the pod for Bio::Tools::Run::RemoteBlast, it's in the next sentences after the code snippet you used. Though, as it's a tutorial example it might be nice to remove the while loop .. or at least add the sleep(5) part. http://www.bioperl.org/wiki/Bptutorial.pl#Running_BLAST_.28using_RemoteBlast.pm.29 Aside from that, you may have network issues but www.ncbi.nlm.nih.gov doesn't respond to ping as far as I can tell. Joel On Tue, Jun 26, 2007 at 02:26:03AM -0700, don esteban wrote: > > Try using the Proxyconfiguration in your script: > > $ENV{"HTTP_PROXY"}="http://proxy.somewhere.org:8080"; > > > > > L Xu wrote: > > > > I do have the internet connection bu not use the proxy server. 
> > I tested the network connection with ping command (below). The ncbi > > website > > does not response. Is there any special network setting needed for > > connecting the ncbi website? > > Thank you so much. > > > > C:\>ping www.yahoo.com > > > > Pinging www.yahoo-ht3.akadns.net [69.147.114.210] with 32 bytes of data: > > > > Reply from 69.147.114.210: bytes=32 time=363ms TTL=45 > > Reply from 69.147.114.210: bytes=32 time=319ms TTL=45 > > Reply from 69.147.114.210: bytes=32 time=312ms TTL=45 > > Reply from 69.147.114.210: bytes=32 time=360ms TTL=45 > > > > Ping statistics for 69.147.114.210: > > Packets: Sent = 4, Received = 4, Lost = 0 (0% loss), > > Approximate round trip times in milli-seconds: > > Minimum = 312ms, Maximum = 363ms, Average = 338ms > > > > C:\>ping www.ncbi.nlm.nih.gov > > > > Pinging www.ncbi.nlm.nih.gov [130.14.29.110] with 32 bytes of data: > > > > Request timed out. > > Request timed out. > > Request timed out. > > Request timed out. > > > > Ping statistics for 130.14.29.110: > > Packets: Sent = 4, Received = 0, Lost = 4 (100% loss), > > > > > > > > = = = Original message = = = > > > > Judging by the output it looks like you have no network access or? can't > > connect to the server (what remoteblast needs).? Make sure you? don't need > > proxy settings. > > > > To preempt the next question, no, I'm not going to explain what a? proxy > > is.? The RemoteBlast docs show how to set them, and Google is a? wonderful > > tool... > > > > chris > > > > On Jun 13, 2007, at 7:16 AM, L Xu wrote: > > > > > > ... > > -------------------- WARNING --------------------- > > MSG: > > An Error Occurred > > > >


> > 500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error) > > > > --------------------------------------------------- > > ... From melvinp at pacific.net.sg Wed Jun 27 01:25:08 2007 From: melvinp at pacific.net.sg (Melvin P) Date: Wed, 27 Jun 2007 13:25:08 +0800 Subject: [Bioperl-l] finding statistics on AA Message-ID: <4681F4B4.8010609@pacific.net.sg> Hi, I am new to BioPerl. I am trying to find out if there is any class that I can use for occupancy numbers/occurrence counts, pseudocounts, observed frequencies etc. given a few amino acid sequences. For example, what is the observed frequency of residue i at position p? My objective is to analyze the information content. Thanks.
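[Editorial sketch] The per-position statistics asked about above can be illustrated in plain Perl, without any BioPerl class. This is only a minimal sketch under stated assumptions: the sequences are already aligned and of equal length, only the 20 standard residues are counted (gaps and ambiguity codes are skipped), the default pseudocount of 1 is arbitrary, and information content is taken as log2(20) minus the column's Shannon entropy with no small-sample correction. The helper name column_stats and the example sequences are hypothetical, not an existing BioPerl API.

```perl
use strict;
use warnings;

# The 20 standard amino acids; anything else (gaps, X, B, Z ...) is ignored.
my @ALPHABET = split //, 'ACDEFGHIKLMNPQRSTVWY';
my %IS_AA    = map { $_ => 1 } @ALPHABET;

# Hypothetical helper, not a BioPerl method.
# Input : array ref of equal-length aligned sequences, optional pseudocount.
# Output: array ref with one hash ref per column (0-based index) holding
#   n      - number of standard residues seen in the column
#   counts - occurrence count per residue seen
#   freq   - observed frequency per residue (count / n)
#   pfreq  - pseudocount-adjusted frequency per residue
#   info   - information content in bits, log2(20) - entropy
sub column_stats {
    my ($seqs, $pseudo) = @_;
    $pseudo = 1 unless defined $pseudo;
    return [] unless @$seqs;

    my $len = length $seqs->[0];
    for my $seq (@$seqs) {
        die "all sequences must have the same length\n"
            unless length($seq) == $len;
    }

    my $max_bits = log(scalar @ALPHABET) / log(2);    # log2(20)
    my @columns;
    for my $pos (0 .. $len - 1) {
        # Occurrence counts of standard residues in this column.
        my %count;
        my $n = 0;
        for my $seq (@$seqs) {
            my $aa = uc substr($seq, $pos, 1);
            next unless $IS_AA{$aa};
            $count{$aa}++;
            $n++;
        }

        # Frequencies and Shannon entropy over the full alphabet.
        my (%freq, %pfreq);
        my $entropy = 0;
        my $denom   = $n + $pseudo * scalar(@ALPHABET);
        for my $aa (@ALPHABET) {
            my $c = $count{$aa} || 0;
            $freq{$aa}  = $n     ? $c / $n                : 0;
            $pfreq{$aa} = $denom ? ($c + $pseudo) / $denom : 0;
            $entropy -= $freq{$aa} * log($freq{$aa}) / log(2)
                if $freq{$aa} > 0;
        }

        push @columns, {
            n      => $n,
            counts => \%count,
            freq   => \%freq,
            pfreq  => \%pfreq,
            # A column with no standard residues carries no information.
            info   => $n ? $max_bits - $entropy : 0,
        };
    }
    return \@columns;
}

# Example with made-up sequences: report what was observed at each position.
my @aligned = qw(MKVLA MKILA MRVLG MKVLA);
my $stats   = column_stats(\@aligned);
for my $pos (0 .. $#$stats) {
    my $col  = $stats->[$pos];
    my @seen = sort keys %{ $col->{counts} };
    printf "position %d: %s | information %.3f bits\n",
        $pos + 1,
        join(', ', map { sprintf('%s=%.2f', $_, $col->{freq}{$_}) } @seen),
        $col->{info};
}
```

The observed frequency of residue i at position p is then $stats->[p-1]{freq}{i}. Sequences read through Bio::AlignIO could presumably be passed in as plain strings, but that wiring is not shown here.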
From bix at sendu.me.uk Wed Jun 27 06:23:58 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 27 Jun 2007 11:23:58 +0100 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <467FBDD3.8050009@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> Message-ID: <46823ABE.2080300@sendu.me.uk> Sendu Bala wrote: > Sendu Bala wrote: >> In considering updating all the test scripts to [... use] >> t/lib/BioperlTest.pm > > I'm now in the process of converting all test scripts. And I've now completed that job (for bioperl-live at least), except for t/EUtilities.t since I know Chris is working on it. In addition to converting to Test::More where necessary, I've also made all pseudo-TODO blocks real ones. Previously I had advised using SKIP blocks instead since TODO blocks need a Test::Harness upgrade. However I think in the next release we ought to make such upgrading compulsory (which should be automatic when combined with compulsory usage of Module::Build and Test::More in turn: users simply have to update CPAN). The conversion to BioperlTest directly led to the discovery and fixing of 6 minor bugs, so was certainly not without merit. No user or developer needs to have BIOPERLDEBUG permanently set to true anymore. To run all tests you just have to answer yes to the BioDBGFF and networking questions of 'perl Build.PL'. With './Build test' you then get clean, easy-to-read output where it is easy to see that we currently have these issues: t/Sopma.t and t/BioGraphics.t still have the failures that I mentioned in another thread. t/protgraph.t, t/blast_pull.t, t/SearchIO.t, t/RestrictionIO.t, t/RNA_SearchIO.t, t/PopGen.t, t/Genewise.t, t/Assembly.t and t/Annotation.t all have TODO tests. If you know about those modules, now would be a great time to implement those TODOs! Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are deprecated' warnings.
To debug a particular test you could say: BIOPERLDEBUG=1 ./Build test --verbose --test_files t/Sopma.t I've updated the HOWTO for writing test scripts: http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests From cjfields at uiuc.edu Wed Jun 27 07:55:47 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 06:55:47 -0500 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <46823ABE.2080300@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> Message-ID: On Jun 27, 2007, at 5:23 AM, Sendu Bala wrote: > Sendu Bala wrote: >> Sendu Bala wrote: >>> In considering updating all the test scripts to [... use] >>> t/lib/BioperlTest.pm >> >> I'm now in the process of converting all test scripts. > > And I've now completed that job (for bioperl-live at least), except > for > t/EUtilities.t since I know Chris is working on it. The network tests will be much shorter; the bulk will be transferred to a new suite for the backend Bio::Tools:EUtilities parser (which will test static files in t/data/eutils, so no dynamic changes). > In addition to converting to Test::More where necessary, I've also > made > all psuedo-TODO blocks real ones. Previously I had advised to use SKIP > blocks instead since TODO blocks need a Test::Harness upgrade. > However I > think in the next release we ought to make such upgrading compulsory > (which should be automatic when combined with compulsory usage of > Module::Build and Test::More in turn: users simply have to update > CPAN). Sounds good to me, but there may be some grumblings out there. Having specific TODOs are nice b/c we can test them w/o fails. Handy. > The conversion to BioperlTest directly led to the discovery and fixing > of 6 minor bugs, so was certainly not without merit. > > > No user or developer needs to have BIOPERLDEBUG permanently set to > true > anymore. 
To run all tests you just have to answer yes to the BioDBGFF > and networking questions of 'perl Build.PL'. With './Build test' you > then get clean, easy-to-read output where it is obvious to see that we > currently have these issues: > > t/Sopma.t and t/BioGraphics.t still have fails that I mentioned in > another thread. > > t/protgraph.t, t/blast_pull.t, t/SearchIO.t, t/RestrictionIO.t, > t/RNA_SearchIO.t, t/PopGen.t, t/Genewise.t, t/Assembly.t and > t/Annotation.t all have TODO tests. If you know about those > modules, now > would be a great time to implement those TODOs! The RNA_SearchIO.t is from ERPIN output; there's no easy way to generate it beyond having the user supply the info (or having the program author change the output). Will have to look at the others to see what's involved; maybe something for the priority list? > Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are > deprecated' warnings. I ran into this with XML::Simple data structures recently; there was an easy way around it via XML::Simple using forcearray(). It has to do with attempting to assign data to/from a hash in a specific way involving array references (though I can't remember exactly how; I slept since then). > To debug a particular test you could say: > BIOPERLDEBUG=1 ./Build test --verbose --test_files t/Sopma.t > > > I've updated the HOWTO for writing test scripts: > http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests Good work! chris From schlesi at ebi.ac.uk Wed Jun 27 07:57:27 2007 From: schlesi at ebi.ac.uk (Felix Schlesinger) Date: Wed, 27 Jun 2007 12:57:27 +0100 Subject: [Bioperl-l] Selecting columns from alignment Message-ID: <7317d50c0706270457i1c3d92a8hb124fa663f51b837@mail.gmail.com> Hi, is there an elegant way to select columns from an alignment object fulfilling a certain property (for example less than x gaps)? Everything I can see from Align::AlignI seems to involve looking at the individual sequences, creating lots of slices and appending them. 
Is there a better way in bioperl or, failing that, does anyone know of a software package with similar functionality (t-coffee has lots of filters for alignments, but nothing to select columns besides by position it seems). Ideally this would also return a mapping from old to new positions in one of the sequences of course. Thanks Felix From cjfields at uiuc.edu Wed Jun 27 10:36:41 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 09:36:41 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <18049.30026.61328.134490@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> Message-ID: <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> On Jun 26, 2007, at 3:21 PM, George Hartzell wrote: > ... > If you have a dev.open-bio.org account and you're in the bioperl > group, you're good to get at it via: > > file:///home/hartzell/bioperl > > or > > svn+ssh://dev.open-bio.org/home/hartzell/bioperl I managed to get it working using file://. Haven't tried svn+ssh yet but I've had persistent problems getting ssh to work properly on my macbook; not sure why yet but I haven't had time to play around with it. > There are a couple of things to think about: > > - how are we going to provide access. I *think* that I heard a > decision to use http:// and https://. Who gets to set that up? That hasn't been decided yet and will be up to a consensus of the core devs, but I think the odds are in favor of allowing https:// but against allowing http://. As for setup that could be anyone with admin privs, though it may be best left up to Chris D, Jason, or Mauricio. > - what do we want to do about keywords.
The cvs2svn tool guesses > and automatically sets the svn:keywords property to Author Date > Revision and Id on many of the files in the tree. If it looks > like it got it right, we can stick with it. Or, we can disable > that conversion and I've cribbed a little script that'll grep out > files using Id and set the svn:keywords property accordingly. Probably again a consensus issue, but you can choose one route. My inclination is the former if it's easier. > - what do we want to do about svn:ignore? I haven't seen any > .cvsignore files. Not sure. I've never used one personally, but (as Jason suggests) if you have ideas for one you can propose them, or we can suggest devs set up svn:ignore locally. > Beyond that, how does the repo look? Seems fine, though a simple 'svn co file:///home/hartzell/bioperl' checkout gets everything (all distros, branches, etc). We need to make sure everyone uses 'svn co file:///home/hartzell/bioperl/bioperl-live/trunk /live' or similar if they just want the latest core/db/etc. We'll also need to start an svn wiki page to show how to get relevant distros (similar in style probably to the cvs page, with dev information, how to set up ssh keys, https stuff, etc). > How are we going to cut over? > > Are we going to try to push svn commits to the read-mostly CVS repo, > or just keep it around for history's sake (I lean towards the latter). I think a clean cut-over. Everyone would be warned to hold commits for a day (lest they be lost), then probably do something in this order: - switch cvs to read-only except for svn commits - run a clean cvs2svn - set up svn as read/write - set up test commits to cvs via svn - disable cvs commit messages to bioperl-guts, enable svn commit messages in its place. - push svn commits over to read-only cvs cvs >>must<< be read-only after that point (no cvs->svn commits), with write access only available through svn.
If at some future point there is no reason to keep it around or that it is more trouble than it's worth, we can make a decision then on cvs's fate. > g. chris From rvos at interchange.ubc.ca Wed Jun 27 10:23:25 2007 From: rvos at interchange.ubc.ca (rvos) Date: Wed, 27 Jun 2007 07:23:25 -0700 (PDT) Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] Message-ID: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> > Are we going to try to push svn commits to the read-mostly CVS repo, > or just keep it around for history's sake (I lean towards the latter). I'm a little confused - surely once the svn is up and running we'll want *no more* cvs commits? Parallel repositories that each accumulate stuff will be a nightmare. I'm probably just not getting your point. Rutger From cjfields at uiuc.edu Wed Jun 27 11:18:03 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 10:18:03 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> Message-ID: <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> On Jun 27, 2007, at 9:23 AM, rvos wrote: > >> Are we going to try to push svn commits to the read-mostly CVS repo, >> or just keep it around for history's sake (I lean towards the >> latter). > > I'm a little confused - surely once the svn is up and running we'll > want *no more* cvs commits? Parallel repositories that each > accumulate stuff will be a nightmare. I'm probably just not getting > your point. > > Rutger Most projects make a clean break with cvs (no more commits) for the reasons you point out. Not sure how the other core devs feel about that but I could go for that; it would def. prevent headaches. We could keep cvs for the time being as read-only, with no svn->cvs syncing. 
There are few projects which have (as a phase-out plan) old read-only cvs repositories available, with an automatic svn->cvs commit following every new svn commit. Not sure how that works, esp. for branching/merging and so on which I could see potentially getting hairy. chris From cjfields at uiuc.edu Wed Jun 27 12:05:49 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 11:05:49 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <18049.30026.61328.134490@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> Message-ID: <5EA56270-3427-4995-B3C1-2789229AACF1@uiuc.edu> On Jun 26, 2007, at 3:21 PM, George Hartzell wrote: > ...If you have a dev.open-bio.org account and you're in the bioperl > group, you're good to get at it via: > > file:///home/hartzell/bioperl > > or > > svn+ssh://dev.open-bio.org/home/hartzell/bioperl Did manage to get svn+ssh working (with some password harassment); core tests passed enough that I think everything's okay. If ssh keys are set up correctly (mine aren't) it should work fine. 
chris From dmessina at wustl.edu Wed Jun 27 12:27:32 2007 From: dmessina at wustl.edu (David Messina) Date: Wed, 27 Jun 2007 11:27:32 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> Message-ID: > [Chris] > > I managed to get it working using file://. Haven't tried svn+ssh yet > but I've had persistent problems getting ssh to work properly on my > macbook; not sure why yet but I haven't had time to play around > with it. I just did a checkout and a test commit, both via svn+ssh -- works great for me. >> [George] >> >> - what do we want to do about keywords. The cvs2svn tool guesses >> and automatically sets the svn:keywords property to Author Date >> Revision and Id on many of the files in the tree. If it looks >> like it got it right, we can stick with it. Or, we can disable >> that conversion and I've cribbed a little script that'll grep out >> files using Id and set the svn:keywords property accordingly. I would think we would want "Author Date Id Rev URL" set on everything, no?. So either cvs2svn or your tool (whichever you think is better), followed by svn propset svn:keywords "Author Date Id Rev URL" * from the root of a working copy would take care of all of the existing files in the repository, I think. George knows more about this than I do, but I think you can set up a global config file with enable-auto-props = yes * = svn:keywords="Author Date Id Rev URL" to ensure it gets set on any future additions to the repository. >> - what do we want to do about svn:ignore? I haven't seen any >> .cvsignore files. > > Not sure. 
I've never used one personally, but (as Jason suggests) if > you have ideas for one you can propose them, or we can suggest devs > set up svn:ignore locally. I use the default global-ignores global-ignores = *.o *.lo *.la #*# .*.rej *.rej .*~ *~ .#* .DS_Store (again, in my system-wide config file), but I'm not tied to that. I do think we should have one, though; individuals can easily override any settings in the system-wide config with their own ~/.subversion/ config. >> Beyond that, how does the repo look? Looks great, George! Thanks for doing this. Dave From hartzell at alerce.com Wed Jun 27 13:00:53 2007 From: hartzell at alerce.com (George Hartzell) Date: Wed, 27 Jun 2007 13:00:53 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> Message-ID: <18050.38853.526224.791878@almost.alerce.com> rvos writes: > > > Are we going to try to push svn commits to the read-mostly CVS repo, > > or just keep it around for history's sake (I lean towards the latter). > > I'm a little confused - surely once the svn is up and running we'll > want *no more* cvs commits? Parallel repositories that each > accumulate stuff will be a nightmare. I'm probably just not getting > your point. There had been some point of keeping a CVS repository around as a read-only mirror of the svn repo, presumably for people whose habits or setup won't let them use svn. In theory, each commit to the svn repo can be automagically pushed down into CVS w/out user intervention; Google will tell you how, but I've never run anything that way. g.
From dmessina at wustl.edu Wed Jun 27 13:27:01 2007 From: dmessina at wustl.edu (David Messina) Date: Wed, 27 Jun 2007 12:27:01 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> Message-ID: <99969FC2-479E-408C-AADB-7664EBE937CF@wustl.edu> > [Chris] > We'll also need to start a svn wiki page to show how to get relevant > distros (similar in style probably to the cvs page, with dev > information, how to set up ssh keys, https stuff, etc). I cloned the CVS page and have started adapting it for Subversion: http://www.bioperl.org/wiki/Using_Subversion I'll do some more on it later today, but if anyone wants to fiddle with it in the interim, please do. Dave From n.haigh at sheffield.ac.uk Wed Jun 27 14:44:16 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Wed, 27 Jun 2007 19:44:16 +0100 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <46823ABE.2080300@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> Message-ID: <4682B000.2050707@sheffield.ac.uk> Sendu Bala wrote: > Sendu Bala wrote: >> Sendu Bala wrote: >>> In considering updating all the test scripts to [... use] >>> t/lib/BioperlTest.pm >> I'm now in the process of converting all test scripts. > > And I've now completed that job (for bioperl-live at least), except for > t/EUtilities.t since I know Chris is working on it. > > > In addition to converting to Test::More where necessary, I've also made > all psuedo-TODO blocks real ones. 
Previously I had advised to use SKIP > blocks instead since TODO blocks need a Test::Harness upgrade. However I > think in the next release we ought to make such upgrading compulsory > (which should be automatic when combined with compulsory usage of > Module::Build and Test::More in turn: users simply have to update CPAN). > > > The conversion to BioperlTest directly led to the discovery and fixing > of 6 minor bugs, so was certainly not without merit. > > > No user or developer needs to have BIOPERLDEBUG permanently set to true > anymore. To run all tests you just have to answer yes to the BioDBGFF > and networking questions of 'perl Build.PL'. With './Build test' you > then get clean, easy-to-read output where it is obvious to see that we > currently have these issues: > > t/Sopma.t and t/BioGraphics.t still have fails that I mentioned in > another thread. > > t/protgraph.t, t/blast_pull.t, t/SearchIO.t, t/RestrictionIO.t, > t/RNA_SearchIO.t, t/PopGen.t, t/Genewise.t, t/Assembly.t and > t/Annotation.t all have TODO tests. If you know about those modules, now > would be a great time to implement those TODOs! > > Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are > deprecated' warnings. Ah, that reminds me! I recently tried to do an install of the cvs head (a week or two ago) on a clean installation of Debian 4.0 (etch). During the installation, of dependencies, Bio::ASN1::EntrezGene threw an error as it depends on Bioperl. I seem to remember this circular dependency cropping up before - am I correct - and can you remind me how this was "fixed"? 
Cheers Nath From bix at sendu.me.uk Wed Jun 27 14:52:01 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 27 Jun 2007 19:52:01 +0100 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <4682B000.2050707@sheffield.ac.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> Message-ID: <4682B1D1.3080206@sendu.me.uk> Nathan S. Haigh wrote: > I recently tried to do an install of the cvs head (a week or two ago) on > a clean installation of Debian 4.0 (etch). During the installation, of > dependencies, Bio::ASN1::EntrezGene threw an error as it depends on > Bioperl. I seem to remember this circular dependency cropping up before > - am I correct - and can you remind me how this was "fixed"? Yes, it always happens. It was 'fixed' by being completely ignored by me. Installation is guaranteed to fail, but if you really want it, trying to install again after you already have Bioperl installed will result in success. Clearly something nicer could be done. Suggestions on a postcard... From cjfields at uiuc.edu Wed Jun 27 15:01:01 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 14:01:01 -0500 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <4682B000.2050707@sheffield.ac.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> Message-ID: On Jun 27, 2007, at 1:44 PM, Nathan S. Haigh wrote: > Sendu Bala wrote: >> ... >> Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are >> deprecated' warnings. > > Ah, that reminds me! > > I recently tried to do an install of the cvs head (a week or two > ago) on > a clean installation of Debian 4.0 (etch). During the installation, of > dependencies, Bio::ASN1::EntrezGene threw an error as it depends on > Bioperl. 
I seem to remember this circular dependency cropping up > before > - am I correct - and can you remind me how this was "fixed"? > > Cheers > Nath Wonder if Mingyi Liu would allow Bio::ASN1::EntrezGene to become part of Bioperl (and he could be come a dev). That would solve it. chris From n.haigh at sheffield.ac.uk Wed Jun 27 15:16:40 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Wed, 27 Jun 2007 20:16:40 +0100 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> Message-ID: <4682B798.1010409@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Chris Fields wrote: > > On Jun 27, 2007, at 1:44 PM, Nathan S. Haigh wrote: > >> Sendu Bala wrote: >>> ... >>> Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are >>> deprecated' warnings. >> >> Ah, that reminds me! >> >> I recently tried to do an install of the cvs head (a week or two ago) on >> a clean installation of Debian 4.0 (etch). During the installation, of >> dependencies, Bio::ASN1::EntrezGene threw an error as it depends on >> Bioperl. I seem to remember this circular dependency cropping up before >> - am I correct - and can you remind me how this was "fixed"? >> >> Cheers >> Nath > > Wonder if Mingyi Liu would allow Bio::ASN1::EntrezGene to become part of > Bioperl (and he could be come a dev). That would solve it. > > chris Just to put the feelers out to see what people think. It seems (to me at least) that Bioperl modules could/should? be released as individual modules and that "bioperl" would really constitute a "bundle" of all these modules - in terms of CPAN anyway. Am I correct in this thinking? The Bio::ASN1::EntrezGene could simply require a particular module rather than the whole of bioperl - might get out of the circular dependency theoretically!? 
I'm not suggesting moving in this direction, but just wondered what others thought about this concept? Nath -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGgreYczuW2jkwy2gRAi5IAJ9/Alq1fktEmAF16DlKcBVcy7d+jQCeIj+X tOFQUQ7cGJLUITEDw1+QLxc= =Yc+g -----END PGP SIGNATURE----- From cjfields at uiuc.edu Wed Jun 27 15:31:44 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 14:31:44 -0500 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <4682B798.1010409@sheffield.ac.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> Message-ID: <33C76559-4771-4FDC-9EEA-1645BC3C576C@uiuc.edu> On Jun 27, 2007, at 2:16 PM, Nathan S. Haigh wrote: > ... > > Just to put the feelers out to see what people think. > > It seems (to me at least) that Bioperl modules could/should? be > released > as individual modules and that "bioperl" would really constitute a > "bundle" of all these modules - in terms of CPAN anyway. Am I > correct in > this thinking? The Bio::ASN1::EntrezGene could simply require a > particular module rather than the whole of bioperl - might get out of > the circular dependency theoretically!? > > I'm not suggesting moving in this direction, but just wondered what > others thought about this concept? > > Nath Well, Steve suggested splitting some of core into distinct groups, which I tend to agree with in some respects (speed up releases for those modules, such as SearchIO, DB, Graphics). The problem we have yet to solve is what we consider 'core'. Is it Bio::Seq and related? Should it include Bio::DB*? Should it just be Bio::* modules with no or very few external dependencies? And so on..., probably not a decision we want to make immediately (until after svn migration, tests finished, maybe a release or two, a beer)... 
The Bioperl module dependency that Bio::ASN1::EntrezGene has is Bio::Index::AbstractSeq. You could try a test build of Bio::ASN1::EntrezGene to see what happens. chris From hlapp at gmx.net Wed Jun 27 15:49:15 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 27 Jun 2007 16:49:15 -0300 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> Message-ID: On Jun 27, 2007, at 1:27 PM, David Messina wrote: > I would think we would want "Author Date Id Rev URL" set on > everything, no?. So either cvs2svn or your tool (whichever you think > is better), followed by > > svn propset svn:keywords "Author Date Id Rev URL" * Shouldn't this be done recursively? -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Wed Jun 27 15:50:27 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 27 Jun 2007 16:50:27 -0300 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> Message-ID: On Jun 27, 2007, at 12:18 PM, Chris Fields wrote: > Most projects make a clean break with cvs (no more commits) for the > reasons you point out. Not sure how the other core devs feel about > that but I could go for that; it would def. prevent headaches. There shouldn't be any cvs write support after the cut-over I think. 
I don't see the benefit that would justify the huge headache potential. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Wed Jun 27 16:01:40 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 15:01:40 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> Message-ID: <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> On Jun 27, 2007, at 2:50 PM, Hilmar Lapp wrote: > > On Jun 27, 2007, at 12:18 PM, Chris Fields wrote: > >> Most projects make a clean break with cvs (no more commits) for the >> reasons you point out. Not sure how the other core devs feel about >> that but I could go for that; it would def. prevent headaches. > > There shouldn't be any cvs write support after the cut-over I > think. I don't see the benefit that would justify the huge headache > potential. > > -hilmar Agreed, so maybe we should set that in stone. That means no svn->cvs syncing post-migration as well, I assume. Now how about a quick straw poll, what kind of access? svn+ssh is already available, but some (Aaron among them) have indicated they would like https as well (not sure how involved it would be to set up). 
chris From hlapp at gmx.net Wed Jun 27 16:08:40 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 27 Jun 2007 17:08:40 -0300 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> Message-ID: <4FDAE951-7CFF-40B1-9CE3-9BCEAD37E58F@gmx.net> On Jun 27, 2007, at 5:01 PM, Chris Fields wrote: > That means no svn->cvs syncing post-migration as well, I assume. That's a bit of a different story. People out there have URL links into our anonymous CVS repository. If it's not too troublesome (and I tend to think it's not) I'd like to maintain those in working order, either b/c there is still a sync'ed r/o cvs, or b/c of a cgi script that maps between the URL flavors (i.e., that maps a CVS-style URL to the equivalent SVN link). -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hartzell at alerce.com Wed Jun 27 16:15:10 2007 From: hartzell at alerce.com (George Hartzell) Date: Wed, 27 Jun 2007 16:15:10 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> Message-ID: <18050.50510.84363.355034@almost.alerce.com> David Messina writes: > > [Chris] > > > > I managed to get it working using file://.
Haven't tried svn+ssh yet > > but I've had persistent problems getting ssh to work properly on my > > macbook; not sure why yet but I haven't had time to play around > > with it. > > I just did a checkout and a test commit, both via svn+ssh -- works > great for me. Is there anyone working outside of bioperl-{run,live,ext}? g. From bix at sendu.me.uk Wed Jun 27 16:22:13 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 27 Jun 2007 21:22:13 +0100 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <4682B798.1010409@sheffield.ac.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> Message-ID: <4682C6F5.4020406@sendu.me.uk> Nathan S. Haigh wrote: > It seems (to me at least) that Bioperl modules could/should? be released > as individual modules and that "bioperl" would really constitute a > "bundle" of all these modules - in terms of CPAN anyway. Am I correct in > this thinking? The Bio::ASN1::EntrezGene could simply require a > particular module rather than the whole of bioperl - might get out of > the circular dependency theoretically!? No, it wouldn't. The 'problem' only arises because the user is /choosing/ to install both Bioperl and Bio::ASN1::EntrezGene at the same time. So even if Bioperl was released as separate modules there would still be that 'bundle' and users would still choose to do the same thing: install all the Bioperl modules as well as all its /optional/ recommended modules. And there lies the problem: Bio::ASN1::EntrezGene requires Bioperl modules, and one Bioperl module requires Bio::ASN1::EntrezGene, so the circularity isn't solved. (FYI: Bio::ASN1::EntrezGene requires Bio::Index::AbstractSeq Bio::Index::AbstractSeq requires a couple of Bioperl modules, including Bio::Root::Root Bio::SeqIO::entrezgene requires Bio::ASN1::EntrezGene and a bunch of Bioperl modules, including Bio::Root::Root. 
) You only avoid circularity by choosing not to install everything in one go. Which is something you can do right now with no problems. From n.haigh at sheffield.ac.uk Wed Jun 27 16:24:18 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Wed, 27 Jun 2007 21:24:18 +0100 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> Message-ID: <4682C772.5070502@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hilmar Lapp wrote: > On Jun 27, 2007, at 12:18 PM, Chris Fields wrote: > >> Most projects make a clean break with cvs (no more commits) for the >> reasons you point out. Not sure how the other core devs feel about >> that but I could go for that; it would def. prevent headaches. > > There shouldn't be any cvs write support after the cut-over I think. > I don't see the benefit that would justify the huge headache potential. > > -hilmar I agree. A clean switch from cvs read/write to svn read/write plus cvs read only sounds the least problematic! However, how will links to cvs be dealt with? Links on Bioperl could be switched over to point to svn, but what about possible links from external sources? Maybe a more generic approach of redirection could work? Or a simple warning page stating the fact that we have moved from cvs to svn and provide a common link to follow? 
Nath -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGgsdyczuW2jkwy2gRAtuyAKDIpN0TNX0U7sTuE3i+fj6WFZ1K0QCfcX7Y 81KurFwJlRtYFxSmLZP56Sk= =pp7b -----END PGP SIGNATURE----- From hlapp at gmx.net Wed Jun 27 16:30:19 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 27 Jun 2007 17:30:19 -0300 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <18049.30026.61328.134490@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> Message-ID: <834150E7-9307-4B02-B5F0-6FB01393435D@gmx.net> On Jun 26, 2007, at 5:21 PM, George Hartzell wrote: > > svn+ssh://dev.open-bio.org/home/hartzell/bioperl > Cool - this works for me. One thing I notice is that in cvs log you see which version is in which branch which is useful to answer user queries that might be a version problem. svn log doesn't seem to want to show that. Does anyone have ideas for how to do this in svn? -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Wed Jun 27 16:32:18 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 27 Jun 2007 17:32:18 -0300 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <4682C772.5070502@sheffield.ac.uk> References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <4682C772.5070502@sheffield.ac.uk> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Jun 27, 2007, at 5:24 PM, Nathan S. Haigh wrote: > However, how will links to cvs be dealt with? 
Well I said before that probably one can write a couple of lines of Perl to write a cgi script that returns the appropriate redirect URL with a redirect status code. -hilmar - -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (Darwin) iD8DBQFGgslWuV6N2JxL7qsRAvsTAKDjR18NzWzlj74mCF+diNpe2dLV2ACgn/4Y f6sJ/ngeKEGpKHgyAHM1DAA= =8n0E -----END PGP SIGNATURE----- From cjfields at uiuc.edu Wed Jun 27 16:50:11 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 15:50:11 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <834150E7-9307-4B02-B5F0-6FB01393435D@gmx.net> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <834150E7-9307-4B02-B5F0-6FB01393435D@gmx.net> Message-ID: <9DD44BC0-95D5-43E1-83F3-E3ACC6825563@uiuc.edu> On Jun 27, 2007, at 3:30 PM, Hilmar Lapp wrote: > > On Jun 26, 2007, at 5:21 PM, George Hartzell wrote: > >> >> svn+ssh://dev.open-bio.org/home/hartzell/bioperl >> > > Cool - this works for me. > > One thing I notice is that in cvs log you see which version is in > which branch which is useful to answer user queries that might be a > version problem. svn log doesn't seem to want to show that. Does > anyone have ideas for how to do this in svn? > > -hilmar We prob. should move it to a new directory ASAP which george can write to when he needs to update. cvs is in /home/repository/bioperl, so maybe something similar, like /home/svn/repository/bioperl?
chris From cjfields at uiuc.edu Wed Jun 27 16:51:37 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 15:51:37 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <4FDAE951-7CFF-40B1-9CE3-9BCEAD37E58F@gmx.net> References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> <4FDAE951-7CFF-40B1-9CE3-9BCEAD37E58F@gmx.net> Message-ID: <4D8CAAD9-4774-47FB-84E0-7FBA50EC377B@uiuc.edu> On Jun 27, 2007, at 3:08 PM, Hilmar Lapp wrote: > > On Jun 27, 2007, at 5:01 PM, Chris Fields wrote: > >> That means no svn->cvs syncing post-migration as well, I assume. > > That's a bit of a different story. People out there have URL links > into our anonymous CVS repository. If it's not too troublesome (and > I tend to think it's not) I'd like to maintain those in working > order, either b/c there is still a sync'ed r/o cvs, or b/c of a cgi > script that maps between the URL flavors (i.e., that maps a CVS- > style URL to the equivalent SVN link). > > -hilmar I'll try getting a wiki page up as a checklist for this, including what direction we're heading in, ideas (your list and CGI redirect ideas, svn:ignore issues, etc). Dave has already started on the 'getting bioperl using svn' wiki page. If we intend to sync cvs with svn we need to find the right tools or at least check for other projects which have done something similar. I haven't googled on that yet but I'll attempt to tonight. chris From cjfields at uiuc.edu Wed Jun 27 16:53:08 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 15:53:08 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: Message-ID: <4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu> bioperl-run also. I think the run CVS repo has some binary files, so if there are any problems with cvs2svn it'll be there.
chris On Jun 27, 2007, at 3:19 PM, Brian Osborne wrote: > George, > > bioperl-db and bioperl-network should be included, I think. > > Brian O > > > On 6/27/07 4:15 PM, "George Hartzell" wrote: > >> David Messina writes: >>>> [Chris] >>>> >>>> I managed to get it working using file://. Haven't tried svn >>>> +ssh yet >>>> but I've had persistent problems getting ssh to work properly on my >>>> macbook; not sure why yet but I haven't had time to play around >>>> with it. >>> >>> I just did a checkout and a test commit, both via svn+ssh -- works >>> great for me. >> >> Is there anyone working outside of bioperl-{run,live,ext}? >> >> g. >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Wed Jun 27 17:05:50 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 27 Jun 2007 22:05:50 +0100 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <4682C6F5.4020406@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> Message-ID: <4682D12E.3000803@sendu.me.uk> Sendu Bala wrote: > Nathan S. Haigh wrote: >> It seems (to me at least) that Bioperl modules could/should? be released >> as individual modules and that "bioperl" would really constitute a >> "bundle" of all these modules - in terms of CPAN anyway. Am I correct in >> this thinking? The Bio::ASN1::EntrezGene could simply require a >> particular module rather than the whole of bioperl - might get out of >> the circular dependency theoretically!? > > No, it wouldn't. [snip] > You only avoid circularity by choosing not to install everything in one > go. Errr... I take that back. 
Since CPAN bundles install things in a certain order, you just have to make sure that everything Bio::ASN1::EntrezGene needs is installed first, then Bio::ASN1::EntrezGene, then Bio::SeqIO::entrezgene. But the main problem with this approach is that maintenance, global-style code improvements and releases become a nightmare. I could, perhaps, imagine a scenario where the repository stayed as-is (one monolithic collection), but the dist action of Build.PL could be altered to generate a release package per module instead of one big release package of all modules, as is currently the case. Is there much value in doing that? Does anyone want me to look into the feasibility of such a thing? From bosborne11 at verizon.net Wed Jun 27 16:19:47 2007 From: bosborne11 at verizon.net (Brian Osborne) Date: Wed, 27 Jun 2007 16:19:47 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <18050.50510.84363.355034@almost.alerce.com> Message-ID: George, bioperl-db and bioperl-network should be included, I think. Brian O On 6/27/07 4:15 PM, "George Hartzell" wrote: > David Messina writes: >>> [Chris] >>> >>> I managed to get it working using file://. Haven't tried svn+ssh yet >>> but I've had persistent problems getting ssh to work properly on my >>> macbook; not sure why yet but I haven't had time to play around >>> with it. >> >> I just did a checkout and a test commit, both via svn+ssh -- works >> great for me. > > Is there anyone working outside of bioperl-{run,live,ext}? > > g. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From n.haigh at sheffield.ac.uk Wed Jun 27 17:25:53 2007 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Wed, 27 Jun 2007 22:25:53 +0100 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <4682D12E.3000803@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> Message-ID: <4682D5E1.2030507@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Sendu Bala wrote: > Sendu Bala wrote: >> Nathan S. Haigh wrote: >>> It seems (to me at least) that Bioperl modules could/should? be released >>> as individual modules and that "bioperl" would really constitute a >>> "bundle" of all these modules - in terms of CPAN anyway. Am I correct in >>> this thinking? The Bio::ASN1::EntrezGene could simply require a >>> particular module rather than the whole of bioperl - might get out of >>> the circular dependency theoretically!? >> >> No, it wouldn't. > [snip] >> You only avoid circularity by choosing not to install everything in >> one go. > > Errr... I take that back. Since CPAN bundles install things in a certain > order, you just have to make sure that everything Bio::ASN1::EntrezGene > needs is installed first, then Bio::ASN1::EntrezGene, then > Bio::SeqIO::entrezgene. > > But the main problem with this approach is that maintenance, > global-style code improvements and releases become a nightmare. I could, > perhaps, imagine a scenario where the repository stayed as-is (one > monolithic collection), but the dist action of Build.PL could be altered > to generate a release package per module instead of one big release > package of all modules, as is currently the case. > > Is there much value in doing that? Does anyone want me to look into the > feasibility of such a thing? 
I think the value would be in other external modules being able to use bioperl modules with more ease (not sure how many modules have, or currently depend on bioperl) as they would depend on a single module, rather than the whole package. However, how would the dependencies of each module be handled? I'm clearly thinking aloud, but... Maybe this would tease apart "cliques" of modules that are interdependent? and could in themselves be shipped as bundles e.g. Bio::Graphics and have a "master" bioperl bundle that installs all the bioperl modules. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGgtXhczuW2jkwy2gRAiftAKDZQGDpaq5saEyE3ZfPyFqli4j+8QCfXbIB 2EZjccEFEzfFlx4H47gzwLk= =nobl -----END PGP SIGNATURE----- From hlapp at gmx.net Wed Jun 27 17:35:28 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 27 Jun 2007 18:35:28 -0300 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu> References: <4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu> Message-ID: <9FEF16FD-47F6-4D7B-A1E6-1A92CC82570C@gmx.net> Is there a reason not to port every subproject over? -hilmar On Jun 27, 2007, at 5:53 PM, Chris Fields wrote: > bioperl-run also. I think the run CVS repo has some binary files, so > if there are any problems with cvs2svn it'll be there. > > chris > > On Jun 27, 2007, at 3:19 PM, Brian Osborne wrote: > >> George, >> >> bioperl-db and bioperl-network should be included, I think. >> >> Brian O >> >> >> On 6/27/07 4:15 PM, "George Hartzell" wrote: >> >>> David Messina writes: >>>>> [Chris] >>>>> >>>>> I managed to get it working using file://. Haven't tried svn >>>>> +ssh yet >>>>> but I've had persistent problems getting ssh to work properly >>>>> on my >>>>> macbook; not sure why yet but I haven't had time to play around >>>>> with it.
>>>> >>>> I just did a checkout and a test commit, both via svn+ssh -- works >>>> great for me. >>> >>> Is there anyone working outside of bioperl-{run,live,ext}? >>> >>> g. >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Wed Jun 27 17:36:29 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 16:36:29 -0500 Subject: [Bioperl-l] Splits again, formerly Test overhaul complete In-Reply-To: <4682D12E.3000803@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> Message-ID: <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> On Jun 27, 2007, at 4:05 PM, Sendu Bala wrote: > Sendu Bala wrote: >> Nathan S. Haigh wrote: >>> It seems (to me at least) that Bioperl modules could/should? be >>> released >>> as individual modules and that "bioperl" would really constitute a >>> "bundle" of all these modules - in terms of CPAN anyway. Am I >>> correct in >>> this thinking? The Bio::ASN1::EntrezGene could simply require a >>> particular module rather than the whole of bioperl - might get >>> out of >>> the circular dependency theoretically!? >> No, it wouldn't. 
> [snip] >> You only avoid circularity by choosing not to install everything >> in one go. > > Errr... I take that back. Since CPAN bundles install things in a > certain order, you just have to make sure that everything > Bio::ASN1::EntrezGene needs is installed first, then > Bio::ASN1::EntrezGene, then Bio::SeqIO::entrezgene. > > But the main problem with this approach is that maintenance, global- > style code improvements and releases become a nightmare. I could, > perhaps, imagine a scenario where the repository stayed as-is (one > monolithic collection), but the dist action of Build.PL could be > altered to generate a release package per module instead of one big > release package of all modules, as is currently the case. > > Is there much value in doing that? Does anyone want me to look into > the feasibility of such a thing? Not for the time being, at least in my opinion. Too much on our plate at this point with svn migration, test conversion, bugzilla running over (next point of attack!), etc. Maybe something to think about after, though I like the idea of a few splits to core as Steve suggested (SearchIO, Graphics, some LWP-related DB modules). My (albeit extreme) thought is to have a lean-and-mean set of 'core' modules with as few external dependencies as possible, which could work around the circular dependency issue in this case:

                    dep.on                       dep.on
Bio::Auxiliary      ----->   ASN1::EntrezGene    ----->   core
(with EntrezGene)                                         (basic SeqIO, Index, DB, etc)
       \---->------>--- dep.on ->----->----->----/

Bioperl auxiliary modules would list core as a required dependency along with anything else needed for that particular aux. section (i.e. XML parsers, LWP, GD, etc.). The whole mess, if needed, would be installed using Bundle::BioPerl or similar, with no part released w/o testing on the whole 'base' to ensure proper interaction. If a fix needed to be made in one set, make the fix, test against bioperl 'base' as a whole, and release when possible.
No need to wait for a full-fledged 1.5.3 release. Maybe wishful thinking... chris From cjfields at uiuc.edu Wed Jun 27 17:44:47 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 16:44:47 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <9FEF16FD-47F6-4D7B-A1E6-1A92CC82570C@gmx.net> References: <4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu> <9FEF16FD-47F6-4D7B-A1E6-1A92CC82570C@gmx.net> Message-ID: <1CB82DD7-A997-47E8-A6A5-2AC9B375E875@uiuc.edu> We should port them all, yes. chris On Jun 27, 2007, at 4:35 PM, Hilmar Lapp wrote: > Is there a reason not to port every subproject over? > > -hilmar From cjfields at uiuc.edu Wed Jun 27 17:53:02 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 16:53:02 -0500 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <4682D5E1.2030507@sheffield.ac.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <4682D5E1.2030507@sheffield.ac.uk> Message-ID: <1DF8175A-5212-40CE-A2C5-B9C34E057D00@uiuc.edu> On Jun 27, 2007, at 4:25 PM, Nathan S. Haigh wrote: >> ... >> Is there much value in doing that? Does anyone want me to look >> into the >> feasibility of such a thing? > > > I think the value would be in other external modules being able to use > bioperl modules with more ease (not sure how many modules have, or > currently depend on bioperl) as they would depend on a single module, > rather than the whole package. However, how would the dependencies of > each module be handled? I'm clearly thinking aloud, but....Maybe this > would tease apart "cliques" of modules that are interdependent? and > could in themselves be shipped as bundles e.g. Bio::Graphics and > have a > "master" bioperl bundle that installs all the bioperl modules.
See my response to Sendu, and Steve Chervitz's original post and related thread: http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15315/focus=15315 which pretty much covers the same ground. I think at most 4-5 split 'cliques', including core, with the fewest possible dependencies in core. If we do any of this, it prob. should wait until after an svn migration and bugzilla bug stomping unless there is a (well-argued) advantage to doing it now. chris From n.haigh at sheffield.ac.uk Wed Jun 27 18:07:31 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Wed, 27 Jun 2007 23:07:31 +0100 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <1DF8175A-5212-40CE-A2C5-B9C34E057D00@uiuc.edu> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <4682D5E1.2030507@sheffield.ac.uk> <1DF8175A-5212-40CE-A2C5-B9C34E057D00@uiuc.edu> Message-ID: <4682DFA3.9090100@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Chris Fields wrote: > > On Jun 27, 2007, at 4:25 PM, Nathan S. Haigh wrote: > >>> ... >>> Is there much value in doing that? Does anyone want me to look into the >>> feasibility of such a thing? >> >> >> I think the value would be in other external modules being able to use >> bioperl modules with more ease (not sure how many modules have, or >> currently depend on bioperl) as they would depend on a single module, >> rather than the whole package. However, how would the dependencies of >> each module be handled? I'm clearly thinking aloud, but....Maybe this >> would tease apart "cliques" of modules that are interdependent? and >> could in themselves be shipped as bundles e.g. Bio::Graphics and have a >> "master" bioperl bundle that installs all the bioperl modules.
> > See my response to Sendu, and Steve Chervitz's original post and related > thread: > > http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15315/focus=15315 > > which pretty much covers the same ground. I think at most 4-5 split > 'cliques', including core, with the fewest possible dependencies in > core. If we do any of this, it prob. should wait until after an svn > migration and bugzilla bug stomping unless there is a (well-argued) > advantage to doing it now. > > chris That's fine by me - or should I say, the best way forward - I was really just thinking aloud :) Nath -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGgt+jczuW2jkwy2gRAhPmAKDCgI1BOp/MOQVUQhQGqWaRRfPTaACfTPix TSi/e8PtYTwpxn6x+ewrjBs= =7Vp1 -----END PGP SIGNATURE----- From bix at sendu.me.uk Wed Jun 27 18:43:48 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 27 Jun 2007 23:43:48 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> Message-ID: <4682E824.1050507@sendu.me.uk> Chris Fields wrote: > On Jun 27, 2007, at 4:05 PM, Sendu Bala wrote: >> But the main problem with this approach is that maintenance, global- >> style code improvements and releases become a nightmare. I could, >> perhaps, imagine a scenario where the repository stayed as-is (one >> monolithic collection), but the dist action of Build.PL could be >> altered to generate a release package per module instead of one big >> release package of all modules, as is currently the case. >> >> Is there much value in doing that? Does anyone want me to look into >> the feasibility of such a thing? 
> > Not for the time being, at least in my opinion. Too much on our > plate at this point with svn migration, test conversion, bugzilla > running over (next point of attack!), etc. Maybe something to think > about after, though I like the idea of a few splits to core as Steve > suggested (SearchIO, Graphics, some LWP-related DB modules). [snip] > If a fix needed to be made in one set, make the fix, test against > bioperl 'base' as a whole, and release when possible. No need to > wait for a full-fledged 1.5.3 release. What advantage is there of these defined splits instead of individual modules? As I see it you lose some of the potential benefits of breaking Bioperl up completely, whilst also suffering the maintenance problems I outlined in my objection to Steve's post. Being able to work on all Bioperl from a single cvs (ne svn) check out/ archive, whilst distributing it as individual modules on CPAN seems like the best of both worlds to me. What am I missing? From hartzell at alerce.com Wed Jun 27 20:41:01 2007 From: hartzell at alerce.com (George Hartzell) Date: Wed, 27 Jun 2007 20:41:01 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <9DD44BC0-95D5-43E1-83F3-E3ACC6825563@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <834150E7-9307-4B02-B5F0-6FB01393435D@gmx.net> <9DD44BC0-95D5-43E1-83F3-E3ACC6825563@uiuc.edu> Message-ID: <18051.925.23313.932916@almost.alerce.com> Chris Fields writes: > [...] > We prob. should move it to a new directory ASAP which george can > write to when he needs to update. cvs is in /home/repository/ > bioperl, so maybe something similar, like /home/svn/repository/bioperl? I'd be parsimonious (lazy...) and go for /home/svn/bioperl. g. 
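
[Editorial sketch, not part of the original thread.] Hilmar's idea from earlier in this thread, a small CGI script that maps a CVS-style URL to its SVN equivalent and answers with a redirect status, might look roughly like the following. Everything specific here is an assumption made for illustration: the base URL `http://example.org/svn/bioperl`, the `<module>/trunk/<file>` layout, and the helper name `cvs_to_svn_url` are hypothetical and do not come from the list. The real ViewCVS and SVN web paths would have to be filled in by whoever administers the server.

```perl
#!/usr/bin/perl
# Minimal sketch of a CVS-to-SVN link redirector run as a CGI script.
# ASSUMPTION: the URL layouts used here are hypothetical placeholders.
use strict;
use warnings;

# Hypothetical base URL of the new SVN web view.
my $SVN_BASE = 'http://example.org/svn/bioperl';

# Map a CVS-style path such as "/bioperl-live/Bio/Seq.pm" to an SVN-style
# URL such as "$SVN_BASE/bioperl-live/trunk/Bio/Seq.pm".
# Returns undef (in scalar context) when the path is missing or does not
# look like "/<module>/<file path>".
sub cvs_to_svn_url {
    my ($path) = @_;
    return unless defined $path;
    my ($module, $rest) = $path =~ m{^/([\w.-]+)/(.+)$}
        or return;
    return "$SVN_BASE/$module/trunk/$rest";
}

# Act as a CGI script only when executed directly, not when loaded by
# another script (for example a test).
unless (caller) {
    my $target = cvs_to_svn_url($ENV{PATH_INFO});
    if (defined $target) {
        # Permanent redirect: old links keep working and point at SVN.
        print "Status: 301 Moved Permanently\r\n";
        print "Location: $target\r\n\r\n";
    }
    else {
        print "Status: 404 Not Found\r\n";
        print "Content-Type: text/plain\r\n\r\n";
        print "No SVN equivalent known for this CVS link.\n";
    }
}
```

Keeping the mapping in its own subroutine means it can be checked without a web server. A read-only synced CVS mirror, the other option Hilmar mentions, would make such a script unnecessary.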
From hartzell at alerce.com Wed Jun 27 20:46:29 2007 From: hartzell at alerce.com (George Hartzell) Date: Wed, 27 Jun 2007 20:46:29 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> Message-ID: <18051.1253.87485.235496@almost.alerce.com> Chris Fields writes: > [...] > Now how about a quick straw poll, what kind of access? svn+ssh is > already available, but some (Aaron among them) have indicated they > would like https as well (not sure how involved it would be to set up). What we do here, in large part, depends on what our host machine makes available to us. Is there an apache instance that we can use? Maybe a separate one? May someone among us configure it, or do we need to ask for help? (in other words, does anyone have sudo?) Is there some reason to not include http: (using Digest authentication so that passwords aren't passed in the clear?)? Maybe even go so far as to ask why bother with https:, it's not like we need to transfer any data encrypted.... g. From dmessina at wustl.edu Wed Jun 27 23:02:25 2007 From: dmessina at wustl.edu (David Messina) Date: Wed, 27 Jun 2007 22:02:25 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> Message-ID: On Jun 27, 2007, at 2:49 PM, Hilmar Lapp wrote: > > On Jun 27, 2007, at 1:27 PM, David Messina wrote: > >> I would think we would want "Author Date Id Rev URL" set on >> everything, no?. 
So either cvs2svn or your tool (whichever you think >> is better), followed by >> >> svn propset svn:keywords "Author Date Id Rev URL" * > > Shouldn't this be done recursively? Yep, good catch! Thanks, Hilmar. Should be: svn propset --recursive svn:keywords "Author Date Id Rev URL" * From jason at bioperl.org Wed Jun 27 23:29:09 2007 From: jason at bioperl.org (Jason Stajich) Date: Thu, 28 Jun 2007 00:29:09 -0300 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <18051.1253.87485.235496@almost.alerce.com> References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> <18051.1253.87485.235496@almost.alerce.com> Message-ID: I think Chris D and I will need to confer a bit on https+svn. I don't know when we'll have a good chance to discuss everything. At some point this discussion may need to be taken off bioperl and limited to just the interested parties as we're delving into hardware geek land. The repository machine (dev) is a locked down machine, meaning it only really runs ssh and few other servers, and httpd is not one of them. We have anonymous CVS (client and through httpd browsing) running on a separate machine (code) that has the info rsynced over every 10 or 15 minutes. The foundation websites and mailing lists run on a third machine (portal). If we decide to support https we'll need to spend a little time deciding how well we can keep it locked down - it will only be https not http for example and we may want to see about limiting ssh access to everyone if we migrate all OBF projects over to SVN and only support https. Again to re-iterate what I think we would do: - SVN read/write will live on 'dev', _WHEN_ we switch over no writes to the CVS repository.
It will be available by ssh+svn and potentially by https+svn - SVN read-only will live on 'code', it will be accessible by http+svn - CVS read-only will live on 'code', this will only be a sync from the SVN to the CVS. See http://svn2cvs.tigris.org/ for details As I tried to ask for in the past, would someone also illustrate the importance of why _WE_ need to switch to SVN on a wiki page on Bioperl so that when someone complains/asks about this in the future the arguments are already laid out. I am basically fine with it, but I don't honestly see a compelling reason beyond what has been mentioned wrt better integration in IDEs. http://bioperl.org/wiki/Why_SVN -jason On Jun 27, 2007, at 9:46 PM, George Hartzell wrote: > Chris Fields writes: >> [...] >> Now how about a quick straw poll, what kind of access? svn+ssh is >> already available, but some (Aaron among them) have indicated they >> would like https as well (not sure how involved it would be to set >> up). > > What we do here, in large part, depends on what our host machine makes > available to us. > > Is there an apache instance that we can use? Maybe a separate one? > > May someone among us configure it, or do we need to ask for help? (in > other words, does anyone have sudo?) > > Is there some reason to not include http: (using Digest authentication > so that passwords aren't passed in the clear?)? Maybe even go so far > as to ask why bother with https:, it's not like we need to transfer > any data encrypted.... > > g. 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From jason at bioperl.org Wed Jun 27 23:51:32 2007 From: jason at bioperl.org (Jason Stajich) Date: Thu, 28 Jun 2007 00:51:32 -0300 Subject: [Bioperl-l] Splits again In-Reply-To: <4682E824.1050507@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> Message-ID: <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org> Hey guys - I'm wading in a bit late as I haven't had time to keep up with the whole discussion. So you are suggesting 800+ individual CPAN modules? I don't think that is a good idea. Why would you split up Bio::Seq::RichSeq and Bio::Seq into two separate packages for example? I think if you really want to move away from the monolithic install it has to be more logical by function - but I am not that optimistic that this is going to actually be easier for people. Maybe I'm misunderstanding. What are the arguments for separating things -- to make it so people aren't scared by the number of modules so they'll code? It seems like some people just want it to be installed and run scripts - does having them install dozens of modules work? Do we need to consider how much this would suck for people who can't use CPAN or Module::Build to automate dependency tracking and installation? How does it work when modules are deprecated? I'm not sure I have made up my mind on what I'd like to see, but at some point I think we need to get a clearer idea of what audience we are trying to serve best.
If we want it to be easy to install maybe we should invest time into making OSX double-click installers, RPMs, and the Windows stuff easily installable. Or do we want to serve the developers who aren't using SVN, so that we want to push out releases of modules ASAP? I just am not clear on the motivation for some of the proposed changes. Also - the main point I wanted to make - Can I suggest we spend a little time discussing what it will take to get a stable release for the current code as it stands (bioperl-live and bioperl-run)? It seems like we really need to do this first so that we have a stable release that can be followed by CVS -> SVN migration, then consider major changes to the repository structure and release packaging, and potential deprecation and incorporation of other modules. I assume there is no chance that we'd have a 1.6 candidate by BOSC next month? Will it be productive to schedule a fair amount of time at BOSC discussing how to partition out the packages into separate sub-packages after we've done a successful release rather than trying to change things right now? I realize not everyone will be there but maybe it will be easier to interact on this then. I think it will also be time to talk with Lincoln/Scott about how Gbrowse is structured and if that is working for them. There is so much code in different places that I think we need to figure out how to structure it properly so those packages can be released. It would probably mean moving Bio::Graphics, Bio::DB::GFF and Bio::DB::SeqFeature and gff tools for Gbrowse into separate packages so they could be released more regularly on par with Gbrowse schedules. Also I think someone needs to figure out Bio::Tools::GFF vs Bio::FeatureIO -- what do we want to do? I don't think we really fully support GFF3 that well -- the X2GFF scripts probably need some more good testing (where X is BLAST, FASTA, Sim4, GenBank, EMBL, etc... ) and/or migration to the proper GFF writing.
-jason On Jun 27, 2007, at 7:43 PM, Sendu Bala wrote: > Chris Fields wrote: >> On Jun 27, 2007, at 4:05 PM, Sendu Bala wrote: >>> But the main problem with this approach is that maintenance, global- >>> style code improvements and releases become a nightmare. I could, >>> perhaps, imagine a scenario where the repository stayed as-is (one >>> monolithic collection), but the dist action of Build.PL could be >>> altered to generate a release package per module instead of one big >>> release package of all modules, as is currently the case. >>> >>> Is there much value in doing that? Does anyone want me to look into >>> the feasibility of such a thing? >> >> Not for the time being, at least in my opinion. Too much on our >> plate at this point with svn migration, test conversion, bugzilla >> running over (next point of attack!), etc. Maybe something to think >> about after, though I like the idea of a few splits to core as Steve >> suggested (SearchIO, Graphics, some LWP-related DB modules). > [snip] >> If a fix needed to be made in one set, make the fix, test against >> bioperl 'base' as a whole, and release when possible. No need to >> wait for a full-fledged 1.5.3 release. > > What advantage is there of these defined splits instead of individual > modules? As I see it you lose some of the potential benefits of > breaking > Bioperl up completely, whilst also suffering the maintenance > problems I > outlined in my objection to Steve's post. > > Being able to work on all Bioperl from a single cvs (ne svn) check > out/ > archive, whilst distributing it as individual modules on CPAN seems > like > the best of both worlds to me. What am I missing? 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From chris at bioteam.net Thu Jun 28 00:08:25 2007 From: chris at bioteam.net (Chris Dagdigian) Date: Thu, 28 Jun 2007 00:08:25 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> <18051.1253.87485.235496@almost.alerce.com> Message-ID: <97A3257B-8E00-48D7-8B7D-51AD728CB8F7@bioteam.net> My understanding of "https+svn" is that it is actually WebDAV-over- HTTP which means that not only would we need to light up a HTTPD server on the developer box we'd also have to get a stable mod_dav module installed (sometimes not trivial) and then we would have to figure out how to handle the authentication bits. Right now with SSH we use Unix group permissions to figure out who can write to what repository -- WebDAV makes this a lot more complicated. Forcing encryption over https will prevent someone from sniffing a developer password which removes the main security issue. The next problem is going to be integrating the DAV module with Linux PAM so that existing usernames and passwords can be used, -OR- we have to set up and maintain an entirely separate set of username and password maps for each developer and each SVN project. I'm not super concerned about this -- BioTeam runs svn internally and we expose our SVN for employees both via WebDAV and SVN+SSH - it's not that hard to set up. My biggest concern really has to do with how much extra work this will mean for the OBF sysadmin team. If there is an easy way to get a stable Apache/DAV/SVN integration going with authentication coming from Linux PAM then this is no big deal. 
If we have to manually maintain separate authentication lists then it will be kind of a hassle. Like Jason mentioned, the OBF currently segregates "stuff" onto three different servers with three levels of security: - dev.open-bio.org -- Developers only, SSH access only (main sourcecode repository for OBF) - portal.open-bio.org -- Websites, Wikis, Blogs, Mailing list servers and helpdesk.open-bio.org - code.open-bio.org -- "Disposable" anonymous access server that we can easily burn/wipe/reinstall if it ever gets hacked Everything else that Jason mentioned is fine and easy to set up (if not already running): - SVN+SSH for developers - Anonymous SVN and Anonymous RSYNC for community access on code.open-bio.org - svn2cvs for whomever wants it on code.open-bio.org - web based SVN code browser installed on http://code.open-bio.org Regards, Chris On Jun 27, 2007, at 11:29 PM, Jason Stajich wrote: > I think Chris D and I will need to confer a bit on https+svn. I > don't know when we'll have a good chance to discuss everything. At > some point this discussion is may need to be taken off bioperl and > just the interested parties as we're delving into hardware geek land. > > The repository machine (dev) is a locked down machine meaning it > only really runs ssh and not many servers include httpd. We have > anonymous CVS (client and through httpd browsing) running on a > separate machine (code) that has the info rsynced over every 10 or > 15 minutes. The foundation websites and mailing lists run on a > third machine (portal). > > > If we decide to support https we'll need to spend a little time > deciding how well we can keep it locked down - it will only be > https not http for example and we may want to see about limiting > ssh access to everyone if we migrate all OBF projects over to SVN > and only support https. > > Again to re-iterate what I think we would do: > - SVN read/write will live on 'dev', _WHEN_ we switch over no > writes to the CVS repository. 
It will be available by ssh+svn and > potentially by https+svn > - SVN read-only will live on 'code', it will be accessible by http > +svn > - CVS read-only will live on 'code', this will only be a sync from > the SVN to the CVS. See http://svn2cvs.tigris.org/ for details > > > As I tried to ask for in the past, would someone also illustrate > the importance of why _WE_ need to switch to SVN on a wiki page on > Bioperl so that when someone complains/asks about this in the > future the arguments are already laid out. I am basically fine > with it, but I don't honestly see a compelling reason beyond what > has been mentioned wrt better integration in IDEs. > http://bioperl.org/wiki/Why_SVN > > -jason > On Jun 27, 2007, at 9:46 PM, George Hartzell wrote: > >> Chris Fields writes: >>> [...] >>> Now how about a quick straw poll, what kind of access? svn+ssh is >>> already available, but some (Aaron among them) have indicated they >>> would like https as well (not sure how involved it would be to >>> set up). >> >> What we do here, in large part, depends on what our host machine >> makes >> available to us. >> >> Is there an apache instance that we can use? Maybe a separate one? >> >> May someone among us configure it, or do we need to ask for help? >> (in >> other words, does anyone have sudo?) >> >> Is there some reason to not include http: (using Digest >> authentication >> so that passwords aren't passed in the clear?)? Maybe even go so far >> as to ask why bother with https:, it's not like we need to transfer >> any data encrypted.... >> >> g. 
>> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > Jason Stajich > jason at bioperl.org > http://jason.open-bio.org/ > > From cjfields at uiuc.edu Thu Jun 28 00:18:03 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 23:18:03 -0500 Subject: [Bioperl-l] Splits again In-Reply-To: <4682E824.1050507@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> Message-ID: On Jun 27, 2007, at 5:43 PM, Sendu Bala wrote: > Chris Fields wrote: > ... >> If a fix needed to be made in one set, make the fix, test against >> bioperl 'base' as a whole, and release when possible. No need to >> wait for a full-fledged 1.5.3 release. > > What advantage is there of these defined splits instead of > individual modules? As I see it you lose some of the potential > benefits of breaking Bioperl up completely, whilst also suffering > the maintenance problems I outlined in my objection to Steve's post. > > Being able to work on all Bioperl from a single cvs (ne svn) check > out/ archive, whilst distributing it as individual modules on CPAN > seems like the best of both worlds to me. What am I missing? Okay, forewarned, but here's my long-winded reasoning. The short and sweet version: I (very) respectfully don't agree with you, at least re: the idea we should commit all modules to CPAN independently. It doesn't make any sense to me, but maybe you can elaborate more? Maybe I'm misinterpreting what you mean? Also, I agree with Steve C. that core is anything but a representation of a 'core' set of modules, and some sections could (should?) be split off into discrete, cohesive units. 
We may be alone in that camp, though it doesn't seem so (it's popped up more than a few times, in one form or another). If you want an in-depth explanation for both opinions, read on (below my sig), or feel free to bypass it. I'll understand. Finally, all of this should wait until later. Much later, like after a decent release, after svn, etc kind of 'later'. I think we can agree on that. . . . . . Still here? Okay... each issue (skip as needed): Individual CPAN modules: CPAN is not our personal versioning system; it may be if a distribution consists of only a few modules, but not when it's one of the largest distros present. If someone wants to update an individual bioperl module for a quick bug fix they are more than welcome to download it via cvs, svn, or even using a web browser, and replace the one they have. In most cases, it works w/o problems. With Module::Build you have even made it easier if a full installation is necessary. I'm trying to reason how one could break up the individual SeqIO/ SearchIO/otherIO modules into single module distributions. They are intrinsically tied together (SeqIO::genbank won't work w/o SeqIO, which relies on the various interfaces, RootIO, and on down). How would tests be run off CPAN when the modules are distributed independently? Would they also be individually distributed? What would you use to tie all the individual modules together? How would you explain to the CPAN maintainers that you want to split bioperl into 990 individual modules, all updated independently, but intend on bundling them afterwards anyway? I'm failing to see the advantages to this approach, but if you can find an example where this was done successfully on CPAN or elsewhere maybe I could see what you mean. Splitting up core: As I see it, here are the advantages of a defined split as Steve and I see it (off the top of my head). Some of this probably reiterates my previous points, as well as Steve's, so apologies in advance. 
- A lean, mean, focused set of bioperl base modules (core) w/o or with very few external deps, minimal installation issues, etc. The very basic stuff to get up and running. - BioPerl bundled modules (Nathan's 'cliques') with defined, focused functionality, code, and tests, which add a bit more 'sugar' to the base functionality of the core. If you only care about parsing BLAST reports, get SearchIO, which requires core and optionally other modules (XML::SAX). If you want additional DB functionality apart from the very basic ones in core, install DB (with its additional requirements, including core, DBI, and so on). Same with Graphics, Tools, Tree/Phylo, etc. We just need to define and limit the number of splits. - Easier to add additional bundled modules. For instance, I could focus all of my RNA work into a discrete set of modules (say, bioperl-rna) which I maintain, I ensure works with the latest core code, I ensure also plays well with the other children =) , and I distribute via CPAN. Same with EUtilities, which could go into a separate DB-related set or stay in core. - If we want a full-fledged 'install everything', the CPAN Bundle system is available. I think it's easier to use a Bundle for 4-5, even 10 groups of modules as opposed to over 900. - A Bundle or a build file where discrete distributions are listed (Bio::SearchIO, etc) wouldn't need to be updated every time a new module is added to a distribution. I suppose this could be automated, but why have the additional headache? - A chance to cut out some cruft. We all know that particular areas need work or a complete overhaul (Restriction, Structure, maybe a few others). Smaller, concentrated sets of modules I believe would be easier to maintain, and those that don't get used will eventually fall out of favor and may be lost or replaced from the more maintained group of modules. Survival of the fittest. - We already have had practice; bioperl-db, bioperl-run, bioperl-network, and others.
Those that have been routinely maintained and enjoy wide use (db, run, network) have survived; others not so much (corba-related stuff, microarray, ext, etc., though the code is still available if someone else wants to take it up and revive it!). Disadvantages of a defined split: - The initial headache of identifying which groups go where, coordinating with those who rely on bioperl (GMOD, etc) on how this will be set up, so on... - Separate groups of modules require testing together to ensure functionality is consistent and maintained (something I think you pointed out previously). - I think an increased possibility of branching is possible. - Extra headaches for devs, who have to keep track of the various critical distributions and make sure they work well together. - Maybe others, but it's getting late here. Add more as needed; I'm sure there are a number more. chris From cjfields at uiuc.edu Thu Jun 28 01:17:01 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 28 Jun 2007 00:17:01 -0500 Subject: [Bioperl-l] Splits again In-Reply-To: <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org> Message-ID: <671B8432-28DA-47DA-9E0C-66AF0E3D5973@uiuc.edu> D'oh! Just when I wanted to go to bed. It's not fair, you're in California... On Jun 27, 2007, at 10:51 PM, Jason Stajich wrote: > Hey guys - I'm wading in a bit late as I haven't had time to keep up > with whole discussion. > > So you are suggesting 800+ individual CPAN modules? I don't think > that is a good idea. Why would you split up Bio::Seq::RichSeq and > Bio::Seq into two separate packages for example? 
I think if you > really want to move away from the monolithic install it has to be > more logical by function - but I am not that optimistic that this is > going to actually be easier for people. Maybe I'm misunderstanding. Okay, so maybe it wasn't just me. > What are the arguments for separating things -- to make it so people > aren't scared by the number of modules so they'll code? It seems > like some people just want it to be installed and run scripts - does > having them install dozens of modules work. Do we need to consider > people how much this would suck if someone can't use CPAN or > Module::Builder to automate dependancy tracking installation? How > does it work when modules are deprecated? What I envision for core is maybe not just one distribution, but a cluster of distributions: base - Bio::Seq; Bio::SeqIO; Bio::AlignIO, some Bio::DB, associated modules. Bare bones, with as few dependencies as possible. aux - Any Bio::SeqIO, Bio::AlignIO, Bio::DB etc. that requires additional modules. search - Bio::Search and SearchIO tools - Bio::Tools, Bio::Restriction, maybe DB modules, GFF-related stuff? graphics - Bio::Graphics. Maybe GMOD-related stuff here? The last four would list bioperl-core as a dependency themselves along with any other modules necessary. We could also have the core Build.PL ask the user if they want to install the other non-base distros, and maybe include bioperl-db, bioperl-network, and bioperl- run in the loop if requested. All would be installed as a bundle similar to Bundle::BioPerl, but have regular CPAN point releases (1.x.x) independently from one another i.e. for bug fixes, with a yearly/biyearly timed full release (1.x) of the whole shebang. Any point release for any 'core' distribution would have to be tested against the others prior to release. 
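A minimal sketch of the layout described above, in plain Perl with no CPAN modules. The distribution names come from the list above; the %requires hash and install_order() are hypothetical names used only for illustration and are not part of BioPerl or Module::Build. It only shows that, once the last four distributions list the base as a dependency, an installer can work out a safe install order on its own.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical dependency table for the proposed split. The names follow the
# list in the message above; the edges are an assumption for illustration.
my %requires = (
    base     => [],
    aux      => ['base'],
    search   => ['base'],
    tools    => ['base'],
    graphics => ['base'],
);

# Return the distributions in an order where every dependency is installed
# before the distribution that needs it (depth-first topological sort).
sub install_order {
    my ($deps) = @_;
    my (%state, @order);
    my $visit;
    $visit = sub {
        my ($dist) = @_;
        my $seen = $state{$dist} // '';
        return if $seen eq 'done';
        die "circular dependency at $dist\n" if $seen eq 'visiting';
        die "unknown distribution $dist\n" unless exists $deps->{$dist};
        $state{$dist} = 'visiting';
        $visit->($_) for sort @{ $deps->{$dist} };
        $state{$dist} = 'done';
        push @order, $dist;
    };
    $visit->($_) for sort keys %$deps;
    return @order;
}

print join(' ', install_order(\%requires)), "\n";
```

In a real split each distribution would declare this in its own build file rather than in one shared table; the sketch just makes the ordering explicit.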
This is basically following Steve's train of thought, though more elaborated: http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15315/focus=15315 > I'm not sure I have made up my mind on what I'd like to see, but at > some point I think we need to get a clearer idea of what audience we > are trying to serve best. If want it to be easy to install maybe we > should invest time into making OSX double-click installers, RPMs, and > the Windows stuff easily installable. If we want to serve the > developers who aren't using SVN so we want to push out releases of > modules ASAP? I just am not clear on the motivation for some of the > proposed changes. I think regular CPAN releases with updated PPMs hosted via portal work fine for the most part, but it would be nice to host RPMs. Others (Allen Day, for instance) have donated time to generate RPMs but they seem to lag behind a bit more. The original idea for svn arose from an unrelated thread with Mark Johnson discussing something (Glimmer maybe?) and took off from there. I was actually pretty surprised it took on a life of its own. As for the motivation to switch, I haven't specifically used it myself, but the large number of responses seem to indicate others have and seem happy with it. Rutger Vos had also indicated he would move Bio::Phylo over to the repo if we used svn. We def. should address the issues you bring up (why _WE_ need svn) more succinctly but that shouldn't be an issue. > Also - the main point I wanted to make - Can I suggest we spend a > little time discussing what it will take to get a stable release for > the current code as it stands (bioperl-live and bioperl-run)? It > seems like we really need to do this first so that we have a stable > release that can be followed by CVS -> SVN migration, then consider > major changes to the repository structure and release packaging, and > potential deprecation and incorporation of other modules. Agreed. We prob.
need to schedule a good couple of days (or so) to squash bugs. > I assume there is no chance that we'd have a 1.6 candidate by BOSC > next month? Um, not likely as nothing has been addressed Feature/Annotation-wise (overloads are still there, methods have not been deprecated, etc). There was an underlying assumption these would have an effect on GMOD- related stuff (I remember reading a post from Scott Cain in the mail archive mentioning something along these lines after the 1.5 release hubbub). Maybe a quick 1.5.3 for BOSC, with a 1.6 for fall? > Will it be productive to schedule a fair amount of time at BOSC > discussing how to partition out the packages into separate sub- > packages after we've done a successful release rather than trying to > change things right now? I realize not everyone will be there but > maybe it will be easier to interact on this then. How many are going to be there? I can't go this year except on my own dime (which I don't have many of, student loans and all, sorry), though I'll likely be in a new lab by spring which is likely more amenable to funding. If there is a hackathon in the late fall (post- sept) I'll make it a point to go regardless. > I think it will also be time to talk with Lincoln/Scott about how > Gbrowse is structured and if that is working for them. There is too > much code in different places that I think we need to figure out how > to structure it properly so those packages can be released. It would > probably mean moving Bio::Graphics, Bio::DB::GFF and > Bio::DB::SeqFeature and gff tools for Gbrowse into separate packages > so they could be released more regularly on par with Gbrowse > schedules. Also I think someone needs to figure out Bio::Tools::GFF > vs Bio::FeatureIO -- what do we want to do? I don't think we really > fully support GFF3 that well -- the X2GFF scripts probably need some > more good testing (where X is BLAST,FASTA,Sim4, GenBank, EMBL, > etc... ) and or migration to the proper GFF writing. 
> > > -jason Will Lincoln or Scott be at BOSC? chris From dmessina at wustl.edu Thu Jun 28 01:21:58 2007 From: dmessina at wustl.edu (David Messina) Date: Thu, 28 Jun 2007 00:21:58 -0500 Subject: [Bioperl-l] finding statistics on AA In-Reply-To: <4681F4B4.8010609@pacific.net.sg> References: <4681F4B4.8010609@pacific.net.sg> Message-ID: Hi Melvin, I don't think BioPerl has any information content-related code. I'm not terribly familiar with it myself, but the usual recommendation is to look at the EMBOSS package: http://en.wikipedia.org/wiki/EMBOSS Dave From bix at sendu.me.uk Thu Jun 28 02:38:48 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 07:38:48 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org> Message-ID: <46835778.5070901@sendu.me.uk> Jason Stajich wrote: > So you are suggesting 800+ individual CPAN modules? > I don't think that is a good idea. Why would you split up > Bio::Seq::RichSeq and Bio::Seq into two separate packages for > example? I think if you really want to move away from the monolithic > install it has to be more logical by function - but I am not that > optimistic that this is going to actually be easier for people. > Maybe I'm misunderstanding. > > What are the arguments for separating things -- to make it so people > aren't scared by the number of modules so they'll code? It seems > like some people just want it to be installed and run scripts - does > having them install dozens of modules work.
Do we need to consider > people how much this would suck if someone can't use CPAN or > Module::Builder to automate dependancy tracking installation? How > does it work when modules are deprecated? See my upcoming reply to Chris. Briefly, if the only change is to the dist action of Build.PL, we can make a single archive of all modules available to non-CPAN users, and individual modules available to CPAN users. No problems. > Also - the main point I wanted to make - Can I suggest we spend a > little time discussing what it will take to get a stable release for > the current code as it stands (bioperl-live and bioperl-run)? It > seems like we really need to do this first so that we have a stable > release that can be followed by CVS -> SVN migration, then consider > major changes to the repository structure and release packaging, and > potential deprecation and incorporation of other modules. I'd recommend that a 'stable' release shouldn't happen until we resolve all the missing tests and bugzilla bugs (because I think the opportunity should be taken to have it stable both in terms of interface /and/ bugs). Which is a lot of work. > I assume there is no chance that we'd have a 1.6 candidate by BOSC > next month? None. From bix at sendu.me.uk Thu Jun 28 03:25:03 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 08:25:03 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> Message-ID: <4683624F.6020402@sendu.me.uk> Chris Fields wrote: > On Jun 27, 2007, at 5:43 PM, Sendu Bala wrote: >> What advantage is there of these defined splits instead of >> individual modules? 
As I see it you lose some of the potential >> benefits of breaking Bioperl up completely, whilst also suffering >> the maintenance problems I outlined in my objection to Steve's post. >> >> Being able to work on all Bioperl from a single cvs (ne svn) check >> out/ archive, whilst distributing it as individual modules on CPAN >> seems like the best of both worlds to me. What am I missing? > > Okay, forewarned, but here's my long-winded reasoning. The short and > sweet version: I (very) respectfully don't agree with you, at least > re: the idea we should commit all modules to CPAN independently. It > doesn't make any sense to me, but maybe you can elaborate more? > Maybe I'm misinterpreting what you mean? The short and sweet version: my proposal has all the benefits of yours, but none of the disadvantages. What's not to like? > Finally, all of this should wait until later. Much later, like after > a decent release, after svn, etc kind of 'later'. I think we can > agree on that. Hmm, not really. If it can be implemented by a change in just Build.PL and ModuleBuildBioperl, its really independent of everything else. That's the beauty of it: the only thing that changes is how things are uploaded to and downloaded from CPAN. The only person that normally deals with that issue is the pumpkin for a release, and he only cares about it at release time. In fact, if we're going to do it at all it makes sense to try it out on a minor release like 1.5.3. We've already got experience of doing it split-style from 1.5.2. (And let me tell you: splits at the code-base level suck.) > Individual CPAN modules: > > CPAN is not our personal versioning system; it may be if a > distribution consists of only a few modules, but not when it's one of > the largest distros present. If someone wants to update an > individual bioperl module for a quick bug fix they are more than > welcome to download it via cvs, svn, or even using a web browser, and > replace the one they have. 
And where is the harm in letting them do it via CPAN as well? In fact, there are significant benefits: > I'm trying to reason how one could break up the individual SeqIO/ > SearchIO/otherIO modules into single module distributions. They are > intrinsically tied together (SeqIO::genbank won't work w/o SeqIO, > which relies on the various interfaces, RootIO, and on down). How > would tests be run off CPAN when the modules are distributed > independently? Bio::SeqIO::genbank would have a dependency on the latest version of Bio::SeqIO (etc.), and Bio::SeqIO would have its own dependencies. So when a user wants to get the latest version of Bio::SeqIO::genbank, they no longer have to worry about what other modules in its dependency hierarchy they should also install. Instead they just request Bio::SeqIO::genbank which itself ensures you have the latest version of all its dependencies before installing itself and running its tests. When a dev makes a major bugfix to Bio::SeqIO::genbank that all genbank users should have, he could just call './Build dist Bio::SeqIO::genbank' which would generate a new package for Bio::SeqIO::genbank suitable for uploading to CPAN. No more long release cycles and having to constantly tell people to 'use CVS' to get working Bioperl code. > Would they also be individually distributed? What > would you use to tie all the individual modules together? How would > you explain to the CPAN maintainers that you want to split bioperl > into 990 individual modules, all updated independently, but intend on > bundling them afterwards anyway? They would be tied together by a CPAN bundle. You don't have to 'explain' anything to the CPAN maintainers because you're not doing anything wrong. In fact, you're using it the way you're supposed to. > Splitting up core: > > As I see it, here are the advantages of a defined split as Steve and > I see it (off the top of my head). 
Some of this probably reiterates > my previous points, as well as Steve's, so apologies in advance. Below I answer with how it would be with my single-module approach compared to the defined splits. > - A lean, mean, focused set of bioperl base modules (core) w/o or > with very few external deps, minimal installation issues, etc. The > very basic stuff to get up and running. Even leaner, even more focused. > - BioPerl bundled modules (Nathan's 'cliques') with defined, focused > functionality, code, and tests, which add a bit more 'sugar' to the > base functionality of the core. If you only care about parsing BLAST > reports, get SearchIO, which requires core and optionally other > modules (XML::SAX). If you want additional DB functionality apart > from the very basic ones in core, install DB (with it's additional > requirements, including core, DBI, and so on). Same with Graphics, > Tools, Tree/Phylo, etc. We just need to define and limit the number > of splits. The same can be achieved with CPAN bundles for each kind of functional grouping you can think of. And since its just a single text file that defines such a grouping, its easy to change or add new ones as you feel like it, as opposed to the rather more permanent and substantial effort of creating one of your splits on the code-base level. Also, the world doesn't have to rely on /our/ ideas of what a useful functional split is. If someone just wants to parse Blast results, they can just use CPAN to install Bio::SearchIO::blast_pull instead of having to install all of SearchIO. > - Easier to add additional bundled modules. For instance, I could > focus all of my RNA work into a discrete set of modules (say, bioperl- > rna) which I maintain, I ensure works with the latest core code, I > ensure also plays well with the other children =) , and I distribute > via CPAN. Same with EUtilities, which could go into a separated DB- > related set or stay in core. And if you lose interest in them? 
They eventually die because they no longer have someone looking after them by default (the pumpkin and other devs). Alternatively you could just make a CPAN bundle. One text file! Easy! No duplication of modules in CPAN, no new hassle for you or the Bioperl 'core' pumpkin to ensure that the latest version of each work with each other and other splits. > - If we want a full-fledged 'install everything', the CPAN Bundle > system is available. I think it's easier to use a Bundle for 4-5, > even 10 groups of modules as opposed to over 900. No, it isn't any easier. Its /equally/ easy to install a bundle of 900 packages of 900 modules as it is to install 5 packages of 900 modules. When not installing absolutely everything, but perhaps 'most' things, there's the additional benefit that it would be easier to skip a particular Bio::module because you didn't want to install its external dependencies and weren't that interested in it anyway. > - A Bundle or a build file where discrete distributions are listed > (Bio::SearchIO, etc) wouldn't need to be updated every time a new > module is added to a distribution. I suppose this could be > automated, but why have the additional headache? Yes, it would be automated, and no, it wouldn't at all be any kind of additional headache. I'm proposing a fully-automated system that the pumpkin wouldn't even have to think about it. Much /less/ of a headache than dealing with splits. Orders of magnitude easier to deal with. > - A chance to cut out some cruft. We all know that particular areas > need work or a complete overhaul (Restriction, Structure, maybe a few > others). Smaller, concentrated sets of modules I believe would be > easier to maintain, and those that don't get use will eventually fall > out of favor and may be lost or replaced from the more maintained > group of modules. Survival of the fittest. And the smallest, most concentrated set of modules is the individual module. 
> - We already have had practice; bioperl-db, bioperl-run, > bioperl-network, and others. Those that have been routinely maintained and > enjoy wide use (db, run, network) have survived; others not so much > (corba-related stuff, microarray, ext, etc., though the code is still > available if someone else wants to take it up and revive it!). The reason some of these existing splits (microarray, ext) have fallen by the wayside? /Because/ they're splits. If they had been part of bioperl-live all along, they'd have been kept in a working, compatible state and would have been released along with everything else in 1.5.2. > Disadvantages of a defined split: > > - The initial headache of identifying which groups go where, > coordinating with those who rely on bioperl (GMOD, etc) on how this > will be set up, so on... No need to worry about this with individual modules. > - Separate groups of modules require testing together to ensure > functionality is consistent and maintained (something I think you > pointed out previously). No need to worry. > - I think an increased possibility of branching is possible. > > - Extra headaches for devs, who have to keep track of the various > critical distributions and make sure they work well together. No headaches. From charles-listes+bioperl at plessy.org Thu Jun 28 03:40:04 2007 From: charles-listes+bioperl at plessy.org (Charles Plessy) Date: Thu, 28 Jun 2007 16:40:04 +0900 Subject: [Bioperl-l] Is bioperl-live the nickname of the CVS head ? Message-ID: <20070628074004.GD6338@kunpuu.plessy.org> Dear developers, I am considering bringing bioperl 1.5.2 into Debian, and I am wondering if it would make sense to call it "bioperl-live" and distribute it in parallel with the stable 1.4.0 version, if bioperl-live means "the current developer version". If I am wrong, can somebody explain to me what bioperl-live exactly refers to?
Have a nice day, -- Charles Plessy Debian-med packaging team Wako, Saitama, Japan From n.haigh at sheffield.ac.uk Thu Jun 28 04:23:10 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Thu, 28 Jun 2007 09:23:10 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: <4683624F.6020402@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: <46836FEE.5030203@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Sendu Bala wrote: > Chris Fields wrote: >> On Jun 27, 2007, at 5:43 PM, Sendu Bala wrote: >>> What advantage is there of these defined splits instead of >>> individual modules? As I see it you lose some of the potential >>> benefits of breaking Bioperl up completely, whilst also suffering >>> the maintenance problems I outlined in my objection to Steve's post. >>> >>> Being able to work on all Bioperl from a single cvs (ne svn) check >>> out/ archive, whilst distributing it as individual modules on CPAN >>> seems like the best of both worlds to me. What am I missing? >> >> Okay, forewarned, but here's my long-winded reasoning. The short and >> sweet version: I (very) respectfully don't agree with you, at least >> re: the idea we should commit all modules to CPAN independently. It >> doesn't make any sense to me, but maybe you can elaborate more? >> Maybe I'm misinterpreting what you mean? > > The short and sweet version: my proposal has all the benefits of yours, > but none of the disadvantages. What's not to like? > > >> Finally, all of this should wait until later. Much later, like after >> a decent release, after svn, etc kind of 'later'. I think we can >> agree on that. > > Hmm, not really. 
If it can be implemented by a change in just Build.PL > and ModuleBuildBioperl, its really independent of everything else. > That's the beauty of it: the only thing that changes is how things are > uploaded to and downloaded from CPAN. The only person that normally > deals with that issue is the pumpkin for a release, and he only cares > about it at release time. > > In fact, if we're going to do it at all it makes sense to try it out on > a minor release like 1.5.3. We've already got experience of doing it > split-style from 1.5.2. (And let me tell you: splits at the code-base > level suck.) > > >> Individual CPAN modules: >> >> CPAN is not our personal versioning system; it may be if a >> distribution consists of only a few modules, but not when it's one of >> the largest distros present. If someone wants to update an >> individual bioperl module for a quick bug fix they are more than >> welcome to download it via cvs, svn, or even using a web browser, and >> replace the one they have. > > And where is the harm in letting them do it via CPAN as well? In fact, > there are significant benefits: > > >> I'm trying to reason how one could break up the individual SeqIO/ >> SearchIO/otherIO modules into single module distributions. They are >> intrinsically tied together (SeqIO::genbank won't work w/o SeqIO, >> which relies on the various interfaces, RootIO, and on down). How >> would tests be run off CPAN when the modules are distributed >> independently? > > Bio::SeqIO::genbank would have a dependency on the latest version of > Bio::SeqIO (etc.), and Bio::SeqIO would have its own dependencies. > > So when a user wants to get the latest version of Bio::SeqIO::genbank, > they no longer have to worry about what other modules in its dependency > hierarchy they should also install. > > Instead they just request Bio::SeqIO::genbank which itself ensures you > have the latest version of all its dependencies before installing itself > and running its tests. 
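[Illustrative sketch of the install behaviour described in the quoted paragraph above: dependencies are installed (and tested) before the module that needs them, and anything already installed is skipped. This is not Bioperl or CPAN code. The dependency table is an assumption made up for the example; only the module names come from the thread, and the sketch ignores version checks and dependency cycles.]

```python
# Hypothetical dependency table: module names follow the thread, but the
# edges are assumed for illustration and are not read from any real
# Build.PL or CPAN metadata.
DEPENDS_ON = {
    "Bio::SeqIO::genbank": ["Bio::SeqIO"],
    "Bio::SeqIO::fasta": ["Bio::SeqIO"],
    "Bio::SeqIO": ["Bio::Root::IO"],
    "Bio::Root::IO": [],
}


def install(module, installed, log):
    """Install dependencies first, then the module itself.

    Modules already in `installed` are skipped, so a shared dependency
    is tested and installed only once.
    """
    if module in installed:
        return
    for dep in DEPENDS_ON[module]:
        install(dep, installed, log)
    log.append(f"test+install {module}")
    installed.add(module)


installed, log = set(), []
install("Bio::SeqIO::genbank", installed, log)
install("Bio::SeqIO::fasta", installed, log)
print("\n".join(log))
```

[Requesting genbank pulls in its assumed dependencies first; requesting fasta afterwards adds only fasta itself, because Bio::SeqIO is already present.]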
This was my thinking when I first brought this up at the beginning/splitting of this thread. This way of thinking of modules as the constituent parts of a larger package should make it far easier for people to define dependencies, and users would only need to install those parts they require. As Sendu points out, if the user wants to convert seqs from genbank to fasta they could simply install Bio::SeqIO::genbank and Bio::SeqIO::fasta and they would get all the other modules that are the dependencies of Bio::SeqIO::genbank and Bio::SeqIO::fasta. > > When a dev makes a major bugfix to Bio::SeqIO::genbank that all genbank > users should have, he could just call './Build dist Bio::SeqIO::genbank' > which would generate a new package for Bio::SeqIO::genbank suitable for > uploading to CPAN. No more long release cycles and having to constantly > tell people to 'use CVS' to get working Bioperl code. However, how would the test suite work out with this? e.g. when someone installs Bio::SeqIO::genbank they want to have the tests associated with Bio::SeqIO::genbank to be run. Would there be tests that would be run redundantly if for example someone installed Bio::SeqIO::genbank and Bio::SeqIO::fasta? > > >> Would they also be individually distributed? What would you use to >> tie all the individual modules together? How would you explain to >> the CPAN maintainers that you want to split bioperl into 990 >> individual modules, all updated independently, but intend on bundling >> them afterwards anyway? > > They would be tied together by a CPAN bundle. You don't have to > 'explain' anything to the CPAN maintainers because you're not doing > anything wrong. In fact, you're using it the way you're supposed to. Yep. Real modules are released as modules, each with its own set of dependencies. Then use CPAN bundles the way they were supposed to be used: for distributing a set of CPAN modules that make up a coherent set of functionality.
You "could" also bundle in other authors modules e.g. Bio::ASN1::EntrezGene? > > >> Splitting up core: >> >> As I see it, here are the advantages of a defined split as Steve and >> I see it (off the top of my head). Some of this probably reiterates >> my previous points, as well as Steve's, so apologies in advance. > > Below I answer with how it would be with my single-module approach > compared to the defined splits. > > >> - A lean, mean, focused set of bioperl base modules (core) w/o or >> with very few external deps, minimal installation issues, etc. The >> very basic stuff to get up and running. > > Even leaner, even more focused. > > >> - BioPerl bundled modules (Nathan's 'cliques') with defined, focused >> functionality, code, and tests, which add a bit more 'sugar' to the >> base functionality of the core. If you only care about parsing BLAST >> reports, get SearchIO, which requires core and optionally other >> modules (XML::SAX). If you want additional DB functionality apart >> from the very basic ones in core, install DB (with it's additional >> requirements, including core, DBI, and so on). Same with Graphics, >> Tools, Tree/Phylo, etc. We just need to define and limit the number >> of splits. > > The same can be achieved with CPAN bundles for each kind of functional > grouping you can think of. And since its just a single text file that > defines such a grouping, its easy to change or add new ones as you feel > like it, as opposed to the rather more permanent and substantial effort > of creating one of your splits on the code-base level. > > Also, the world doesn't have to rely on /our/ ideas of what a useful > functional split is. If someone just wants to parse Blast results, they > can just use CPAN to install Bio::SearchIO::blast_pull instead of having > to install all of SearchIO. > > >> - Easier to add additional bundled modules. 
For instance, I could >> focus all of my RNA work into a discrete set of modules (say, bioperl- >> rna) which I maintain, I ensure works with the latest core code, I >> ensure also plays well with the other children =) , and I distribute >> via CPAN. Same with EUtilities, which could go into a separated DB- >> related set or stay in core. > > And if you lose interest in them? They eventually die because they no > longer have someone looking after them by default (the pumpkin and other > devs). Alternatively you could just make a CPAN bundle. One text file! > Easy! No duplication of modules in CPAN, no new hassle for you or the > Bioperl 'core' pumpkin to ensure that the latest version of each work > with each other and other splits. Hmm, how would module versions be handled? Wouldn't this approach require each module to have its own independent version number, which could then be used for building the dependencies? Each new release of that module would only bump that module's version number. Bundles can specify the minimum version of a module to be installed, such that bug fixes to individual modules can be released into CPAN and would automatically get picked up when installing bundles etc. I'm not quite sure how the current stable/dev releases would work. I assume bug fixes would have to be made on a branch e.g. branch 1.6 and released to CPAN from there. Then when the next stable release is made, all module versions would be bumped and released to CPAN, with any modifications to the content of the bundle being made at the same time. Is it possible to have stable and developer release bundles that are able to specify the minimum stable and developer module versions respectively? > > >> - If we want a full-fledged 'install everything', the CPAN Bundle >> system is available. I think it's easier to use a Bundle for 4-5, >> even 10 groups of modules as opposed to over 900. > > No, it isn't any easier.
Its /equally/ easy to install a bundle of 900 > packages of 900 modules as it is to install 5 packages of 900 modules. > > When not installing absolutely everything, but perhaps 'most' things, > there's the additional benefit that it would be easier to skip a > particular Bio::module because you didn't want to install its external > dependencies and weren't that interested in it anyway. > > >> - A Bundle or a build file where discrete distributions are listed >> (Bio::SearchIO, etc) wouldn't need to be updated every time a new >> module is added to a distribution. I suppose this could be >> automated, but why have the additional headache? > > Yes, it would be automated, and no, it wouldn't at all be any kind of > additional headache. I'm proposing a fully-automated system that the > pumpkin wouldn't even have to think about it. Much /less/ of a headache > than dealing with splits. Orders of magnitude easier to deal with. > > >> - A chance to cut out some cruft. We all know that particular areas >> need work or a complete overhaul (Restriction, Structure, maybe a few >> others). Smaller, concentrated sets of modules I believe would be >> easier to maintain, and those that don't get use will eventually fall >> out of favor and may be lost or replaced from the more maintained >> group of modules. Survival of the fittest. > > And the smallest, most concentrated set of modules is the individual > module. > > >> - We already have had practice; bioperl-db, bioperl-run, bioperl- >> network, and others. Those that have been routinely maintained and >> enjoy wide use (db, run, network) have survived; others not so much >> (corba-related stuff, microarray, ext, etc., though the code is still >> available if someone else wants to take it up and revive it!). > > The reason some of these existing splits (micoarray, ext) have fallen by > the way-side? /Because/ they're splits. 
If they had been part of > bioperl-live all along, they'd have been kept in a working, compatible > state and would have been released along with everything else in 1.5.2 > > >> Disadvantages of a defined split: >> >> - The initial headache of identifying which groups go where, >> coordinating with those who rely on bioperl (GMOD, etc) on how this >> will be set up, so on... > > No need to worry about this with individual modules. > > >> - Separate groups of modules require testing together to ensure >> functionality is consistent and maintained (something I think you >> pointed out previously). > > No need to worry. Maybe need to worry about how the tests are run when installing individual modules etc? > > >> - I think an increased possibility of branching is possible. >> >> - Extra headaches for devs, who have to keep track of the various >> critical distributions and make sure they work well together. > > No headaches. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGg2/uczuW2jkwy2gRAlR4AJ44kHIXWWapNVGOIrkFBJdP9rn3vwCdErhT VkymyXNshguE44/RilEXWDA= =O5ex -----END PGP SIGNATURE----- From n.haigh at sheffield.ac.uk Thu Jun 28 04:27:54 2007 From: n.haigh at sheffield.ac.uk (Nathan S.
Haigh) Date: Thu, 28 Jun 2007 09:27:54 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: <4683624F.6020402@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: <4683710A.9010808@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Sendu Bala wrote: > Chris Fields wrote: >> On Jun 27, 2007, at 5:43 PM, Sendu Bala wrote: >>> What advantage is there of these defined splits instead of >>> individual modules? As I see it you lose some of the potential >>> benefits of breaking Bioperl up completely, whilst also suffering >>> the maintenance problems I outlined in my objection to Steve's post. >>> >>> Being able to work on all Bioperl from a single cvs (ne svn) check >>> out/ archive, whilst distributing it as individual modules on CPAN >>> seems like the best of both worlds to me. What am I missing? >> >> Okay, forewarned, but here's my long-winded reasoning. The short and >> sweet version: I (very) respectfully don't agree with you, at least >> re: the idea we should commit all modules to CPAN independently. It >> doesn't make any sense to me, but maybe you can elaborate more? >> Maybe I'm misinterpreting what you mean? > > The short and sweet version: my proposal has all the benefits of yours, > but none of the disadvantages. What's not to like? > > >> Finally, all of this should wait until later. Much later, like after >> a decent release, after svn, etc kind of 'later'. I think we can >> agree on that. > > Hmm, not really. If it can be implemented by a change in just Build.PL > and ModuleBuildBioperl, its really independent of everything else. 
> That's the beauty of it: the only thing that changes is how things are > uploaded to and downloaded from CPAN. The only person that normally > deals with that issue is the pumpkin for a release, and he only cares > about it at release time. > > In fact, if we're going to do it at all it makes sense to try it out on > a minor release like 1.5.3. We've already got experience of doing it > split-style from 1.5.2. (And let me tell you: splits at the code-base > level suck.) > > >> Individual CPAN modules: >> >> CPAN is not our personal versioning system; it may be if a >> distribution consists of only a few modules, but not when it's one of >> the largest distros present. If someone wants to update an >> individual bioperl module for a quick bug fix they are more than >> welcome to download it via cvs, svn, or even using a web browser, and >> replace the one they have. > > And where is the harm in letting them do it via CPAN as well? In fact, > there are significant benefits: > > >> I'm trying to reason how one could break up the individual SeqIO/ >> SearchIO/otherIO modules into single module distributions. They are >> intrinsically tied together (SeqIO::genbank won't work w/o SeqIO, >> which relies on the various interfaces, RootIO, and on down). How >> would tests be run off CPAN when the modules are distributed >> independently? > > Bio::SeqIO::genbank would have a dependency on the latest version of > Bio::SeqIO (etc.), and Bio::SeqIO would have its own dependencies. > > So when a user wants to get the latest version of Bio::SeqIO::genbank, > they no longer have to worry about what other modules in its dependency > hierarchy they should also install. > > Instead they just request Bio::SeqIO::genbank which itself ensures you > have the latest version of all its dependencies before installing itself > and running its tests. 
> > When a dev makes a major bugfix to Bio::SeqIO::genbank that all genbank > users should have, he could just call './Build dist Bio::SeqIO::genbank' > which would generate a new package for Bio::SeqIO::genbank suitable for > uploading to CPAN. No more long release cycles and having to constantly > tell people to 'use CVS' to get working Bioperl code. > > >> Would they also be individually distributed? What would you use to >> tie all the individual modules together? How would you explain to >> the CPAN maintainers that you want to split bioperl into 990 >> individual modules, all updated independently, but intend on bundling >> them afterwards anyway? > > They would be tied together by a CPAN bundle. You don't have to > 'explain' anything to the CPAN maintainers because you're not doing > anything wrong. In fact, you're using it the way you're supposed to. > The successor to Bundles - may prove interesting: http://search.cpan.org/~adamk/Task-1.01/lib/Task.pm > >> Splitting up core: >> >> As I see it, here are the advantages of a defined split as Steve and >> I see it (off the top of my head). Some of this probably reiterates >> my previous points, as well as Steve's, so apologies in advance. > > Below I answer with how it would be with my single-module approach > compared to the defined splits. > > >> - A lean, mean, focused set of bioperl base modules (core) w/o or >> with very few external deps, minimal installation issues, etc. The >> very basic stuff to get up and running. > > Even leaner, even more focused. > > >> - BioPerl bundled modules (Nathan's 'cliques') with defined, focused >> functionality, code, and tests, which add a bit more 'sugar' to the >> base functionality of the core. If you only care about parsing BLAST >> reports, get SearchIO, which requires core and optionally other >> modules (XML::SAX). 
If you want additional DB functionality apart >> from the very basic ones in core, install DB (with it's additional >> requirements, including core, DBI, and so on). Same with Graphics, >> Tools, Tree/Phylo, etc. We just need to define and limit the number >> of splits. > > The same can be achieved with CPAN bundles for each kind of functional > grouping you can think of. And since its just a single text file that > defines such a grouping, its easy to change or add new ones as you feel > like it, as opposed to the rather more permanent and substantial effort > of creating one of your splits on the code-base level. > > Also, the world doesn't have to rely on /our/ ideas of what a useful > functional split is. If someone just wants to parse Blast results, they > can just use CPAN to install Bio::SearchIO::blast_pull instead of having > to install all of SearchIO. > > >> - Easier to add additional bundled modules. For instance, I could >> focus all of my RNA work into a discrete set of modules (say, bioperl- >> rna) which I maintain, I ensure works with the latest core code, I >> ensure also plays well with the other children =) , and I distribute >> via CPAN. Same with EUtilities, which could go into a separated DB- >> related set or stay in core. > > And if you lose interest in them? They eventually die because they no > longer have someone looking after them by default (the pumpkin and other > devs). Alternatively you could just make a CPAN bundle. One text file! > Easy! No duplication of modules in CPAN, no new hassle for you or the > Bioperl 'core' pumpkin to ensure that the latest version of each work > with each other and other splits. > > >> - If we want a full-fledged 'install everything', the CPAN Bundle >> system is available. I think it's easier to use a Bundle for 4-5, >> even 10 groups of modules as opposed to over 900. > > No, it isn't any easier. 
Its /equally/ easy to install a bundle of 900 > packages of 900 modules as it is to install 5 packages of 900 modules. > > When not installing absolutely everything, but perhaps 'most' things, > there's the additional benefit that it would be easier to skip a > particular Bio::module because you didn't want to install its external > dependencies and weren't that interested in it anyway. > > >> - A Bundle or a build file where discrete distributions are listed >> (Bio::SearchIO, etc) wouldn't need to be updated every time a new >> module is added to a distribution. I suppose this could be >> automated, but why have the additional headache? > > Yes, it would be automated, and no, it wouldn't at all be any kind of > additional headache. I'm proposing a fully-automated system that the > pumpkin wouldn't even have to think about it. Much /less/ of a headache > than dealing with splits. Orders of magnitude easier to deal with. > > >> - A chance to cut out some cruft. We all know that particular areas >> need work or a complete overhaul (Restriction, Structure, maybe a few >> others). Smaller, concentrated sets of modules I believe would be >> easier to maintain, and those that don't get use will eventually fall >> out of favor and may be lost or replaced from the more maintained >> group of modules. Survival of the fittest. > > And the smallest, most concentrated set of modules is the individual > module. > > >> - We already have had practice; bioperl-db, bioperl-run, bioperl- >> network, and others. Those that have been routinely maintained and >> enjoy wide use (db, run, network) have survived; others not so much >> (corba-related stuff, microarray, ext, etc., though the code is still >> available if someone else wants to take it up and revive it!). > > The reason some of these existing splits (micoarray, ext) have fallen by > the way-side? /Because/ they're splits. 
If they had been part of > bioperl-live all along, they'd have been kept in a working, compatible > state and would have been released along with everything else in 1.5.2 > > >> Disadvantages of a defined split: >> >> - The initial headache of identifying which groups go where, >> coordinating with those who rely on bioperl (GMOD, etc) on how this >> will be set up, so on... > > No need to worry about this with individual modules. > > >> - Separate groups of modules require testing together to ensure >> functionality is consistent and maintained (something I think you >> pointed out previously). > > No need to worry. > > >> - I think an increased possibility of branching is possible. >> >> - Extra headaches for devs, who have to keep track of the various >> critical distributions and make sure they work well together. > > No headaches. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGg3EKczuW2jkwy2gRAriiAJ47Qz9jTshEXuaG0XMYrUTI0hHqAwCeL45r r/BykCKbM9lqJM0khARuEms= =NB4B -----END PGP SIGNATURE----- From n.haigh at sheffield.ac.uk Thu Jun 28 04:51:19 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Thu, 28 Jun 2007 09:51:19 +0100 Subject: [Bioperl-l] Is bioperl-live the nickname of the CVS head ? In-Reply-To: <20070628074004.GD6338@kunpuu.plessy.org> References: <20070628074004.GD6338@kunpuu.plessy.org> Message-ID: <46837687.7010101@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Charles Plessy wrote: > Dear developpers, > > I am considering to bring bioperl 1.5.2 in Debian, and I am wondering if > it would make sense to call it "bioperl-live" and distribute it in > parallel with the stable 1.4.0 version, if bioperl-live means "the > current developepr version". > > If I am wrong, can somebody explain me what bioperl-live exactly refers > to ? 
> > Have a nice day, > bioperl-live really means the HEAD of the cvs repository, so it is the most bleeding-edge code available. Version 1.5.* is the developer release, while the 1.4.* is the stable release. However, there have been few updates to the 1.4.* release which means that it is more unstable than the 1.5.* dev release. I think the consensus was to have more rapid release cycles of the stable branch in future in order to avoid this. I'm sure there are others more qualified to expand/correct me on this if needs be. Nath From bix at sendu.me.uk Thu Jun 28 05:11:39 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 10:11:39 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: <46836FEE.5030203@sheffield.ac.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> <46836FEE.5030203@sheffield.ac.uk> Message-ID: <46837B4B.7060705@sendu.me.uk> Nathan S. Haigh wrote: (Please try and snip more: don't quote whole posts just to reply to certain paragraphs) > Sendu Bala wrote: >> Chris Fields wrote: >> When a dev makes a major bugfix to Bio::SeqIO::genbank that all genbank >> users should have, he could just call './Build dist Bio::SeqIO::genbank' >> which would generate a new package for Bio::SeqIO::genbank suitable for >> uploading to CPAN. No more long release cycles and having to constantly >> tell people to 'use CVS' to get working Bioperl code. > > However, how would the test suite work out with this? e.g.
when someone > installs Bio::SeqIO::genbank they want to have the tests associated with > Bio::SeqIO::genbank to be run. Would there be tests that would be run > redundantly if for example someone installed Bio::SeqIO::genbank and > Bio::SeqIO::fasta? We would want to move to a strict test-script-per-module system. But that's desirable in any case, as it would greatly ease reaching our goal of complete test coverage, and subsequent maintenance of those tests. The genbank test would only run tests specific to genbank parsing, and likewise for fasta. They would both have a dependency on Bio::SeqIO, and if that was also recently updated, it would get installed prior to you installing genbank (and therefore run its own generic SeqIO tests), but wouldn't get installed again (wouldn't run its tests again) when you install fasta afterwards. On the subject of tests, I'm reminded of another benefit of the individual-module approach. Currently if a test fails during a CPAN install, nothing gets installed. Users do one of: # refuse to install at all (strict sys-admins) # cry and give up (newbies) # cry and seek help (newbies who really really need Bioperl) # force install, leaving them in some undefined state because they didn't understand the problems (most remaining users) # force install, happy that the problems are ok (some Bioperl devs) With a bundle of individual modules you would install virtually all Bioperl modules with no problems, and the problems with the remainder would be clear to everyone. No one would need to force install since the test results would now be meaningful: the thing you're trying to install really isn't going to work if the tests are failing. If you really needed that particular Bioperl module you could then pay particular attention to why it's failing (most likely some problem with an external dependency). >>> Would they also be individually distributed? What would you use to >>> tie all the individual modules together?
>> >> They would be tied together by a CPAN bundle. You don't have to >> 'explain' anything to the CPAN maintainers because you're not doing >> anything wrong. In fact, you're using it the way you're supposed to. > > Yep. Real modules are released as modules, each with their own set of > dependencies. Then use CPAN bundles the way they were supposed to be used: > distributing a set of CPAN modules that make a coherent set of > functionality. You "could" also bundle in other authors' modules e.g. > Bio::ASN1::EntrezGene? Any bundle featuring Bio::SeqIO::entrezgene would necessarily include Bio::ASN1::EntrezGene in the bundle. > Hmm, how would module versions be handled? Wouldn't this approach > require each module to have its own independent version number, which > could then be used for building the dependencies? Each new release of > that module would only bump that module's version number. Yes, that's how it would work. No more global version number. > Bundles can specify the minimum version of a module to be installed, > such that bug fixes to individual modules can be released into CPAN and > would automatically get picked up when installing bundles etc. Yes. > I'm not quite sure how the current stable/dev releases would work. I > assume bug fixes would have to be made on a branch e.g. branch 1.6 and > released to cpan from there. Then when the next stable release is made, > all module versions would be bumped and released to CPAN. With any > modifications to the content of the bundle to be made. Is it possible to > have stable and developer release bundles that are able to specify the > minimum stable and developer module versions respectively? No, the distinction becomes pretty meaningless. We could still do big major releases, but modules wouldn't be version-bumped. The big release would just be an update of the bundle that specifies the latest version of all Bioperl modules.
Remember that bundles only specify the minimum version, not the required version: in this brave new world users would end up with the same versions of modules if they installed a 1.8 bundle compared to 1.7 bundle. The only way to get a true snapshot of 1.7 after it was released would be if we took snapshots and archived them, making them available from bioperl.org (or by checking out the 1.7 tag from cvs/svn). I don't see that as a significant problem. You lose the trivial benefit of being able to install old snapshots from CPAN. The people who have a great need to install old snapshots can find their way to bioperl.org no problem. From bix at sendu.me.uk Thu Jun 28 04:50:09 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 09:50:09 +0100 Subject: [Bioperl-l] Is bioperl-live the nickname of the CVS head ? In-Reply-To: <20070628074004.GD6338@kunpuu.plessy.org> References: <20070628074004.GD6338@kunpuu.plessy.org> Message-ID: <46837641.8050106@sendu.me.uk> Charles Plessy wrote: > I am considering to bring bioperl 1.5.2 in Debian, and I am wondering if > it would make sense to call it "bioperl-live" and distribute it in > parallel with the stable 1.4.0 version, if bioperl-live means "the > current developepr version". > > If I am wrong, can somebody explain me what bioperl-live exactly refers > to ? bioperl-live is the name of the CVS repository containing what is currently considered the 'Core package' or core modules. http://www.bioperl.org/wiki/Using_CVS If you want to call it something to distinguish it from stable, call it 'developer' vs 'stable' or '1.5.2' vs '1.4.0'. To distinguish them both from the other packages, call them 'core' vs 'run' etc. 
From hlapp at gmx.net Thu Jun 28 06:31:29 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 28 Jun 2007 07:31:29 -0300 Subject: [Bioperl-l] Splits again In-Reply-To: <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org> Message-ID: <23C3729D-6E13-445F-99AF-54B4DC0BB4E8@gmx.net> On Jun 28, 2007, at 12:51 AM, Jason Stajich wrote: > [...] Also - the main point I wanted to make - Can I suggest we > spend a > little time discussing what it will take to get a stable release for > the current code as it stands (bioperl-live and bioperl-run)? It > seems like we really need to do this first so that we have a stable > release that can be followed by CVS -> SVN migration, then consider > major changes to the repository structure and release packaging, and > potential deprecation and incorporation of other modules. I agree we need to discuss a path towards 1.6, but I think that should be kept separate from the cvs->svn migration. Otherwise one stalls the other (by stopping people who seem to have the energy and motivation right now to do one but not the other) for no really good reason. > I assume there is no chance that we'd have a 1.6 candidate by BOSC > next month? I'm not sure that's feasible to be happening but if someone steps up it maybe it is. > > Will it be productive to schedule a fair amount of time at BOSC > discussing how to partition out the packages into separate sub- > packages after we've done a successful release rather than trying to > change things right now? I agree. I also don't think that people are partitioning right now (other than the existing partitioning), though maybe I'm mistaken. 
> [...] > It would probably mean moving Bio::Graphics, Bio::DB::GFF and > Bio::DB::SeqFeature and gff tools for Gbrowse into separate packages > so they could be released more regularly on par with Gbrowse > schedules. Possibly. I'm not fully sure why those modules couldn't also be released more often out of the "main trunk" of modules. In Java/ant, it'd be relatively easy to write build script filters that select the appropriate modules and package them on the fly. I'm not sure whether the build tools for Perl can do that too, though. > Also I think someone needs to figure out Bio::Tools::GFF > vs Bio::FeatureIO -- what do we want to do? I believe FeatureIO has the ontology download tied into it? -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Thu Jun 28 06:47:39 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 28 Jun 2007 07:47:39 -0300 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> <18051.1253.87485.235496@almost.alerce.com> Message-ID: On Jun 28, 2007, at 12:29 AM, Jason Stajich wrote: > As I tried to ask for in the past, would someone also illustrate the > importance of why _WE_ need to switch to SVN on a wiki page on > Bioperl so that when someone complains/asks about this in the future > the arguments are already laid out. I am basically fine with it, but > I don't honestly see a compelling reason beyond what has been > mentioned wrt better integration in IDEs. > http://bioperl.org/wiki/Why_SVN I guess at the end of the day svn is just the system of choice for new developers. I've had people tell me who started with svn that cvs seems a lot harder to use. 
The newer projects are all on svn and for example to integrate Bio::Phylo into BioPerl should become a question of the revision control system. At the end of the day if being on svn makes it easier for new people to contribute it's enough of an argument for me, whether it's rational or not. IMHO, there's two advantages that svn has over cvs. First, directories are versioned, have properties, and generally are the same class of citizens as files. They can be added, renamed, and removed from the repository. In cvs, we all know what a hassle it is to rename or even retire directories. Second, svn log gives you the commits, i.e., the set of changes that constituted one particular commit (and therefore version increase). In cvs that's hard or impossible to reconstruct. Bottom line - I don't think many people if any will question why we moved from cvs to svn ... My $0.02 ... -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hartzell at alerce.com Wed Jun 27 20:34:37 2007 From: hartzell at alerce.com (George Hartzell) Date: Wed, 27 Jun 2007 20:34:37 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <1CB82DD7-A997-47E8-A6A5-2AC9B375E875@uiuc.edu> References: <4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu> <9FEF16FD-47F6-4D7B-A1E6-1A92CC82570C@gmx.net> <1CB82DD7-A997-47E8-A6A5-2AC9B375E875@uiuc.edu> Message-ID: <18051.541.684705.567954@almost.alerce.com> Chris Fields writes: > We should port them all, yes. > > chris > > On Jun 27, 2007, at 4:35 PM, Hilmar Lapp wrote: > > > Is there a reason not to port every subproject over? > > > > -hilmar They're all there. At least everything that I found in the CVS repo. Some of the directories were empty, some had very little content, I was just mechanical about it. 
Here's what I have: [hartzell at dev ~]$ svn ls file://`pwd`/bioperl biodata/ bioperl-cookbook/ bioperl-corba-client/ bioperl-corba-server/ bioperl-das-client/ bioperl-db/ bioperl-ext/ bioperl-gui/ bioperl-live/ bioperl-microarray/ bioperl-network/ bioperl-papers/ bioperl-pedigree/ bioperl-pipeline/ bioperl-run/ biosql-schema/ html/ task-manager/ xml-html/ I wasn't very clear in my original request, but I was hoping that someone out there who's familiar with the various out-of-the-way bits and pieces could take a look at them. I was afraid that everyone was just checking out bioperl-live and doing 'make test'. Someone (chris?) made a point about binary files in bioperl-run. It'd be great if someone in the know could check on them. Also, to the degree that it's possible, look around at various tags and branches and see if they're what you'd expect. Thanks! g. From bix at sendu.me.uk Thu Jun 28 08:21:37 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 13:21:37 +0100 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <18049.30026.61328.134490@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> Message-ID: <4683A7D1.8070403@sendu.me.uk> George Hartzell wrote: > Chris Fields writes: > > [...] > > It looks like George Hartzell may be taking a crack at it, with > > Rutger Vos, Nathan Haigh, and moi helping out where needed. If so we > > could have something testable relatively soon. After that we'll need > > to work out a few other issues, basically what's on Hilmar's list. > > There's a repository on file:///home/hartzell/bioperl with all of the > components projects in place. 
> > If you have a dev.open-bio.org account and you're in the bioperl > group, you're good to get at it via: > > file:///home/hartzell/bioperl I'm confused. Presumably that only works whilst logged into dev.open-bio.org? > svn+ssh://dev.open-bio.org/home/hartzell/bioperl I just tried: svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl on Mac OS X and things seemed to go well, except for this error message at the end: svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data' svn: Can't move source to dest svn: Can't move 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' to 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': No such file or directory I also ended up with only: bioperl-corba-server bioperl-db bioperl-live bioperl-network bioperl-papers biosql-schema Am I doing something totally wrong here? From hartzell at alerce.com Thu Jun 28 08:32:36 2007 From: hartzell at alerce.com (George Hartzell) Date: Thu, 28 Jun 2007 08:32:36 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> <18051.1253.87485.235496@almost.alerce.com> Message-ID: <18051.43620.481558.447399@almost.alerce.com> Jason Stajich writes: > [...] > The repository machine (dev) is a locked down machine meaning it only > really runs ssh and not many servers include httpd. We have > anonymous CVS (client and through httpd browsing) running on a > separate machine (code) that has the info rsynced over every 10 or 15 > minutes. A great way to provide a read-only mirror of the repos. for anonymous users is to have svnsync running out of cron on code.open-bio.org, configured to pull from the dev.open-bio.org repository. 
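A minimal sketch of that svnsync arrangement, assuming Subversion 1.4 or later on the mirror host. The mirror path /home/svn-mirror/bioperl and the 15-minute schedule are placeholders, not anything that has been set up; the source URL is the test repository mentioned in this thread. The function only prints the one-time setup commands and a crontab line so they can be reviewed before anyone runs them.

```shell
#!/bin/sh
# Print the setup commands and cron entry for a read-only svnsync mirror.
# $1 is the source repository URL; $2 is a hypothetical local path for the
# mirror repository on the public machine.
mirror_plan() {
    src=$1
    dest=$2
    # svnsync stores its bookkeeping in revision properties, so the mirror
    # needs a pre-revprop-change hook that allows those changes.
    cat <<EOF
svnadmin create $dest
printf '#!/bin/sh\nexit 0\n' > $dest/hooks/pre-revprop-change
chmod +x $dest/hooks/pre-revprop-change
svnsync init file://$dest $src
svnsync sync file://$dest
*/15 * * * * svnsync sync file://$dest
EOF
}

mirror_plan svn+ssh://dev.open-bio.org/home/hartzell/bioperl /home/svn-mirror/bioperl
```

The last printed line is meant for the crontab of whichever account owns the mirror; everything above it is run once by hand.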
It might actually work to have rsync mirror the fsfs-backed repository, but that's scary-poking-into-the-internals. g. From hartzell at alerce.com Thu Jun 28 08:43:37 2007 From: hartzell at alerce.com (George Hartzell) Date: Thu, 28 Jun 2007 08:43:37 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> Message-ID: <18051.44281.831316.749586@almost.alerce.com> David Messina writes: > > On Jun 27, 2007, at 2:49 PM, Hilmar Lapp wrote: > > > > > On Jun 27, 2007, at 1:27 PM, David Messina wrote: > > > >> I would think we would want "Author Date Id Rev URL" set on > >> everything, no? So either cvs2svn or your tool (whichever you think > >> is better), followed by > >> > >> svn propset svn:keywords "Author Date Id Rev URL" * > > > > Shouldn't this be done recursively? > > > Yep, good catch! Thanks, Hilmar. > > Should be: > > svn propset --recursive svn:keywords "Author Date Id Rev URL" * That's not quite what you want either. It'll set the keyword property on all of the files, including things where you probably don't want expansion to happen (e.g. images, someone said there are binary wads in bioperl-run, etc...). The Right Thing To Do is to grub around (grep) for '\$Id:' (and the others) and set svn:keywords on files that are already using keywords. I have a Bourne shell hack that'll do this, although it's painful because it has to run in working directories.... Once we settle on a list of keywords to use, I'll take a whack at the demo repository.
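The grep-then-propset idea could look roughly like the sketch below. This is not the shell hack mentioned above, just an illustration under a few assumptions: it is run inside a checked-out working copy, the keyword list is the "Author Date Id Rev URL" set proposed in this thread, and the function names are made up. By default it only prints the svn commands; setting DRY_RUN=0 runs them.

```shell
#!/bin/sh
# List files under a directory that already contain an RCS-style keyword
# such as $Id$ or $Id: ... $, skipping the .svn administrative directories.
keyword_files() {
    dir=${1:-.}
    find "$dir" -name .svn -prune -o -type f \
        -exec grep -lE '\$(Id|Author|Date|Rev|Revision|URL)(:[^$]*)?\$' {} +
}

# Set svn:keywords only on those files. With DRY_RUN=1 (the default here)
# the commands are printed instead of executed.
set_keywords() {
    dir=${1:-.}
    keyword_files "$dir" | while IFS= read -r f; do
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "svn propset svn:keywords 'Author Date Id Rev URL' $f"
        else
            svn propset svn:keywords 'Author Date Id Rev URL' "$f"
        fi
    done
}
```

Calling `set_keywords .` from the top of a working copy prints one propset command per file that already uses a keyword, which leaves images and other binary files alone.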
Likewise, you probably DON'T want to use this in your config file: enable-auto-props = yes * = svn:keywords="Author Date Id Rev URL" since it'll do the same thing. The Right Thing To Do is a more tedious *.pl = svn:keywords="Author Date Id Rev URL" *.pm = svn:keywords="Author Date Id Rev URL" *.c = svn:keywords="Author Date Id Rev URL" A bit of googling will give you a good starting point for the list, and we should probably maintain a common one somewhere in the repo. I don't think that there's a server side way of doing this, short of running some script via a hook around commit time. g. From hartzell at alerce.com Thu Jun 28 08:54:40 2007 From: hartzell at alerce.com (George Hartzell) Date: Thu, 28 Jun 2007 08:54:40 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> <18051.1253.87485.235496@almost.alerce.com> Message-ID: <18051.44944.982207.37624@almost.alerce.com> Hilmar Lapp writes: > [...] > IMHO, there's two advantages that svn has over cvs. First, > directories are versioned, have properties, and generally are the > same class of citizens as files. They can be added, renamed, and > removed from the repository. In cvs, we all know what a hassle it is > to rename or even retire directories. Second, svn log gives you the > commits, i.e., the set of changes that constituted one particular > commit (and therefore version increase). In cvs that's hard or > impossible to reconstruct. Two more: - svn groups changes into revisions, so that they can be considered together, CVS versions individual files. - subversion tracks renames/moves correctly, - subversion commits are atomic, so you never have to worry about all of your stuff making it into the repos. at the same time [if you've never had to un-muck this, count yourself blessed!] 
, - svk, which allows disconnected development while still commiting your work to a repo at natural points along the way (you can revert, branch, etc.... to your hearts content). [yeah, that's 3, err, 4. Math is hard.] g. From cjfields at uiuc.edu Thu Jun 28 09:07:24 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 28 Jun 2007 08:07:24 -0500 Subject: [Bioperl-l] Splits again In-Reply-To: <23C3729D-6E13-445F-99AF-54B4DC0BB4E8@gmx.net> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org> <23C3729D-6E13-445F-99AF-54B4DC0BB4E8@gmx.net> Message-ID: <01812F01-9409-49FB-9061-330FA52177C1@uiuc.edu> On Jun 28, 2007, at 5:31 AM, Hilmar Lapp wrote: > > On Jun 28, 2007, at 12:51 AM, Jason Stajich wrote: > >> ...It >> seems like we really need to do this first so that we have a stable >> release that can be followed by CVS -> SVN migration, then consider >> major changes to the repository structure and release packaging, and >> potential deprecation and incorporation of other modules. > > I agree we need to discuss a path towards 1.6, but I think that > should be kept separate from the cvs->svn migration. Otherwise one > stalls the other (by stopping people who seem to have the energy and > motivation right now to do one but not the other) for no really good > reason. It's good to discuss it as long as it doesn't take time and energy away from other priorities. >> I assume there is no chance that we'd have a 1.6 candidate by BOSC >> next month? > > I'm not sure that's feasible to be happening but if someone steps up > it maybe it is. Maybe a 1.5.3 and (if we work hard on it) a 1.6 soon after. 
Then maybe work on partitioning if everyone's up for it and a scheme is worked out. >> Will it be productive to schedule a fair amount of time at BOSC >> discussing how to partition out the packages into separate sub- >> packages after we've done a successful release rather than trying to >> change things right now? > > I agree. I also don't think that people are partitioning right now > (other than the existing partitioning), though maybe I'm mistaken. The original proposal was based on Steve's idea of splitting up core. I don't think a partition is feasible at this point, at least until we put more thought into it (our energy should be focused elsewhere), but it's well worth discussing as a future path. At this time there are two proposals: 1) Steve's and my 'split into discrete sections' proposal, where we split core into self-sustaining sections with a common core listed as a dependency, tying installation of all together with a Bundle or similar. 2) Sendu's 'break everything up' approach where all modules are submitted independently to CPAN, with their own tests, dependencies, etc. There are advantages and disadvantages to both approaches. Not sure if CPAN would go for the latter (it's pretty drastic), but I don't know for sure. If you want in on that discussion (in this thread) feel free to join in! The more the merrier! >> [...] >> It would probably mean moving Bio::Graphics, Bio::DB::GFF and >> Bio::DB::SeqFeature and gff tools for Gbrowse into separate packages >> so they could be released more regularly on par with Gbrowse >> schedules. > > Possibly. I'm not fully sure why those modules couldn't also be > released more often out of the "main trunk" of modules. In Java/ant, > it'd be relatively easy to write build script filters that select the > appropriate modules and package them on the fly. I'm not sure whether > the build tools for Perl can do that too, though. 
Both approaches above would probably use Module::Build to install other bioperl dependencies, each of which could have it's own dependency set, possibly using a Bundle to tie everything together. >> Also I think someone needs to figure out Bio::Tools::GFF >> vs Bio::FeatureIO -- what do we want to do? > > I believe FeatureIO has the ontology download tied into it? > > -hilmar From recent posts here and on the gbrowse mail list by Scott and Lincoln, it seemed like they were moving away from using Bio::DB::GFF and were trying to get users to switch to Bio::DB::SeqFeature. Maybe should get a more direct response? chris From hartzell at alerce.com Thu Jun 28 09:16:18 2007 From: hartzell at alerce.com (George Hartzell) Date: Thu, 28 Jun 2007 09:16:18 -0400 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <4683A7D1.8070403@sendu.me.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> Message-ID: <18051.46242.942184.758493@almost.alerce.com> Sendu Bala writes: > George Hartzell wrote: > > Chris Fields writes: > > > [...] > > > It looks like George Hartzell may be taking a crack at it, with > > > Rutger Vos, Nathan Haigh, and moi helping out where needed. If so we > > > could have something testable relatively soon. After that we'll need > > > to work out a few other issues, basically what's on Hilmar's list. > > > > There's a repository on file:///home/hartzell/bioperl with all of the > > components projects in place. > > > > If you have a dev.open-bio.org account and you're in the bioperl > > group, you're good to get at it via: > > > > file:///home/hartzell/bioperl > > I'm confused. Presumably that only works whilst logged into > dev.open-bio.org? 
Yes, that only works if you're actually on the machine. > > svn+ssh://dev.open-bio.org/home/hartzell/bioperl > > I just tried: > > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl > > on Mac OS X and things seemed to go well, except for this error message > at the end: > > > svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data' > svn: Can't move source to dest > svn: Can't move > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' > to > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': > No such file or directory > > I also ended up with only: > bioperl-corba-server bioperl-db bioperl-live > bioperl-network bioperl-papers biosql-schema > > > Am I doing something totally wrong here? It looks like you tried to check out the *entire* repository. It never occured to me to try that. I'll take a look at what you reported. g. From bix at sendu.me.uk Thu Jun 28 09:20:19 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 14:20:19 +0100 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <18051.46242.942184.758493@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.46242.942184.758493@almost.alerce.com> Message-ID: <4683B593.3050108@sendu.me.uk> George Hartzell wrote: > Sendu Bala writes: >> I just tried: >> >> svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl [snip] > It looks like you tried to check out the *entire* repository. Yes. If you don't want everything, how does one 'browse' the repository to find out the address of the thing you /do/ want? > It never occured to me to try that. I'll take a look at what you > reported. Cheers. 
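For browsing without checking anything out, `svn ls` (used earlier in this thread on the file:// URL) also accepts the svn+ssh:// form. A small sketch, assuming the test repository location quoted in this thread and a trunk/tags layout under each project, as the paths in the error message suggest; the helper name repo_url is made up for illustration.

```shell
#!/bin/sh
# Base URL of the test repository as quoted in this thread.
BASE=svn+ssh://dev.open-bio.org/home/hartzell/bioperl

# Build the URL for a project and an optional path inside it
# (repo_url is a hypothetical helper, not part of svn).
repo_url() {
    project=$1
    subpath=${2:-}
    if [ -n "$subpath" ]; then
        printf '%s/%s/%s\n' "$BASE" "$project" "$subpath"
    else
        printf '%s/%s\n' "$BASE" "$project"
    fi
}

# Print the browse and checkout commands rather than running them,
# since they need an account on dev.open-bio.org.
echo "svn ls $BASE"
echo "svn ls $(repo_url bioperl-live)"
echo "svn ls $(repo_url bioperl-live tags)"
echo "svn co $(repo_url bioperl-live trunk) bioperl-live"
```

Listing the top level first and then drilling down avoids pulling every tag and branch of every project in one checkout.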
From bix at sendu.me.uk Thu Jun 28 09:27:29 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 14:27:29 +0100 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <18049.22260.967524.353173@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.22260.967524.353173@almost.alerce.com> Message-ID: <4683B741.5020600@sendu.me.uk> George Hartzell wrote: > There don't seem to be any .cvsignore files in the repository, or in > CVSROOT/cvsignore. > > Am I missing something, or don't we use them? It would be great to have the following files svn:ignored : In all package roots: ? Build ? MANIFEST ? MANIFEST.SKIP ? META.yml ? _build ? bioperl-*.tar.bz2 ? bioperl-*.tar.gz ? bioperl-*.zip ? blib ? cover_db In any and all directories: ? .DS_Store ? .DAV In bioperl-live: ? t/BioDBSeqFeature.t ? t/BioDBSeqFeature_BDB.t ? t/BioDBSeqFeature_mysql.t Can't think of anything else right now. Thanks for your efforts, Sendu. From cjfields at uiuc.edu Thu Jun 28 09:30:43 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 28 Jun 2007 08:30:43 -0500 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <4683A7D1.8070403@sendu.me.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> Message-ID: On Jun 28, 2007, at 7:21 AM, Sendu Bala wrote: >> ... >> file:///home/hartzell/bioperl > > I'm confused. Presumably that only works whilst logged into > dev.open-bio.org? Yes, it's just a tester. 
>> svn+ssh://dev.open-bio.org/home/hartzell/bioperl > > I just tried: > > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl Try 'svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- live/trunk /mybiodir' to check out the main trunk for core. chris From hartzell at alerce.com Thu Jun 28 09:57:00 2007 From: hartzell at alerce.com (George Hartzell) Date: Thu, 28 Jun 2007 09:57:00 -0400 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <4683A7D1.8070403@sendu.me.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> Message-ID: <18051.48684.996884.134046@almost.alerce.com> Sendu Bala writes: > [...] > I just tried: > > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl > > on Mac OS X and things seemed to go well, except for this error message > at the end: > > > svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data' > svn: Can't move source to dest > svn: Can't move > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' > to > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': > No such file or directory > > I also ended up with only: > bioperl-corba-server bioperl-db bioperl-live > bioperl-network bioperl-papers biosql-schema > > > Am I doing something totally wrong here? So, you probably wanted something like svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk to pick up the head of the bioperl live tree (or /.../bioperl-run/trunk, etc...). 
I just checked out

  svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/

and it ran to completion and gave me

  (delicious)[6:50am]~/tmp>>ls bioperl | cat
  biodata
  bioperl-cookbook
  bioperl-corba-client
  bioperl-corba-server
  bioperl-das-client
  bioperl-db
  bioperl-ext
  bioperl-gui
  bioperl-live
  bioperl-microarray
  bioperl-network
  bioperl-papers
  bioperl-pedigree
  bioperl-pipeline
  bioperl-run
  biosql-schema
  html
  task-manager
  xml-html

Can another mac os x user out there give the Great Big Checkout a try
and see if it runs to completion. Potential problems that come to
mind are:

 - the "mac's are case insensitive, sort of" problem
 - you filled up your disk
 - something else.

g.

From charles-listes+bioperl at plessy.org Thu Jun 28 09:44:56 2007
From: charles-listes+bioperl at plessy.org (Charles Plessy)
Date: Thu, 28 Jun 2007 22:44:56 +0900
Subject: [Bioperl-l] Is bioperl-live the nickname of the CVS head ?
In-Reply-To: <46837687.7010101@sheffield.ac.uk>
References: <20070628074004.GD6338@kunpuu.plessy.org>
 <46837687.7010101@sheffield.ac.uk>
Message-ID: <20070628134456.GB14492@kunpuu.plessy.org>

On Thu, Jun 28, 2007 at 09:51:19AM +0100, Nathan S. Haigh wrote:
>
> Version 1.5.* is the developer release, while the 1.4.* is the stable
> release. However, there have been few updates to the 1.4.* release which
> means that it is more unstable than the 1.5.* dev release. I think the
> consensus was to have more rapid release cycles of the stable branch in
> future in order to avoid this. I'm sure there are others more qualified
> to expand/correct me on this if needs be.

Ok, thank you all for the answers. I think that I will simply upgrade
bioperl to 1.5.2 in Debian testing, and maybe rename it bioperl-core
when I package other components.
Have a nice day, -- Charles Plessy Debian-Med packaging team Wako, Saitama, Japan From bix at sendu.me.uk Thu Jun 28 10:19:49 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 15:19:49 +0100 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <18051.48684.996884.134046@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> Message-ID: <4683C385.3050904@sendu.me.uk> George Hartzell wrote: > Sendu Bala writes: > > [...] > > I just tried: > > > > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl > > > > on Mac OS X and things seemed to go well, except for this error message > > at the end: > > > > > > svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data' > > svn: Can't move source to dest > > svn: Can't move > > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' > > to > > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': > > No such file or directory > > > > I also ended up with only: > > bioperl-corba-server bioperl-db bioperl-live > > bioperl-network bioperl-papers biosql-schema I tried again in the same location and it told me I had to 'svn cleanup', which I did. But subsequently it kept complaining about files already being there. > I just checked out > > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/ > > and it ran to completion [snip] > Can another mac os x user out there give the Great Big Checkout a try > and see if it runs to completion. Potential problems that come to > mind are: > > - the "mac's are case insensitive, sort of" problem > - you filled up your disk > - something else. 
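The first possibility in the quoted list, a case-insensitive file system, can be checked before attempting a checkout. The following is a minimal Python sketch, not part of svn: the helper name `case_collisions` is hypothetical, and the sample paths are illustrative (only HUMBETGLOA.FASTA appears in the error message above; the lowercase spelling is assumed for the demonstration). The idea is to feed it the output of a recursive listing and see whether any two paths differ only in case.

```python
from collections import defaultdict


def case_collisions(paths):
    """Group paths that would map to the same name on a
    case-insensitive file system (hypothetical helper)."""
    groups = defaultdict(set)
    for path in paths:
        # casefold() gives a case-insensitive key for comparison
        groups[path.casefold()].add(path)
    # Keep only groups with more than one distinct spelling
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)


if __name__ == "__main__":
    # Illustrative listing; in practice this could come from `svn ls -R URL`
    listing = [
        "t/data/HUMBETGLOA.FASTA",
        "t/data/HUMBETGLOA.fasta",  # assumed spelling, for illustration only
        "t/data/phredfile.phd",
    ]
    for group in case_collisions(listing):
        print("collides:", ", ".join(group))
```

An empty result would suggest looking at the other possibilities instead.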
Well, I didn't run out of disc space. After a rm -fr * and trying again it failed at exactly the same point, in the same way. svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data causes this repeatable problem: [...] A data/phredfile.phd svn: In directory 'data' svn: Can't move source to dest svn: Can't move 'data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' to 'data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': No such file or directory That is with Mac OS X svn command-line client, version 1.4.4 I can get bioperl-live/tags/release-0-9-2/t/data to check out fine with a linux svn command-line client, version 1.2.3. Cheers, Sendu. From dmessina at wustl.edu Thu Jun 28 11:08:59 2007 From: dmessina at wustl.edu (David Messina) Date: Thu, 28 Jun 2007 10:08:59 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <18051.44281.831316.749586@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> Message-ID: > [George] > Likewise, you probably DON'T want to use this in your config file: > > enable-auto-props = yes > * = svn:keywords="Author Date Id Rev URL" > > since it'll do the same thing. Ah, so I've been doing it wrong all along then. :) Thanks, George! > The Right Thing To Do is a more tedious > > *.pl = svn:keywords="Author Date Id Rev URL" > *.pm = svn:keywords="Author Date Id Rev URL" > *.c = svn:keywords="Author Date Id Rev URL" > > A bit of googling will give you a good starting point for the list, > and we should probably maintain a common one somewhere in the repo. 
I've googled around and gathered the following as a possible list for
our repo. Since I obviously don't know what I'm doing :), of course
adjust and refine as necessary.

Dave

-------

[auto-props]
# Code formats
*.c = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.cpp = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.h = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.java = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.as = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.cgi = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.js = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/javascript
*.php = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-php
*.pl = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-perl; svn:executable
*.pm = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-perl
*.py = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-python; svn:executable
*.sh = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-sh; svn:executable

# Image formats
*.bmp = svn:mime-type=image/bmp
*.gif = svn:mime-type=image/gif
*.ico = svn:mime-type=image/ico
*.jpeg = svn:mime-type=image/jpeg
*.jpg = svn:mime-type=image/jpeg
*.png = svn:mime-type=image/png
*.tif = svn:mime-type=image/tiff
*.tiff = svn:mime-type=image/tiff

# Data formats
*.pdf = svn:mime-type=application/pdf
*.avi = svn:mime-type=video/avi
*.doc = svn:mime-type=application/msword
*.eps = svn:mime-type=application/postscript
*.gz = svn:mime-type=application/gzip
*.mov = svn:mime-type=video/quicktime
*.mp3 = svn:mime-type=audio/mpeg
*.ppt = svn:mime-type=application/vnd.ms-powerpoint
*.ps = svn:mime-type=application/postscript
*.psd = svn:mime-type=application/photoshop
*.rtf = svn:mime-type=text/rtf
*.swf = svn:mime-type=application/x-shockwave-flash
*.tgz = svn:mime-type=application/gzip
*.wav = svn:mime-type=audio/wav
*.xls = svn:mime-type=application/vnd.ms-excel
*.zip = svn:mime-type=application/zip

# Text formats
.htaccess = svn:mime-type=text/plain
*.css = svn:mime-type=text/css
*.dtd = svn:mime-type=text/xml
*.html = svn:mime-type=text/html
*.ini = svn:mime-type=text/plain
*.sql = svn:mime-type=text/x-sql
*.txt = svn:mime-type=text/plain
*.xhtml = svn:mime-type=text/xhtml+xml
*.xml = svn:mime-type=text/xml
*.xsd = svn:mime-type=text/xml
*.xsl = svn:mime-type=text/xml
*.xslt = svn:mime-type=text/xml
*.xul = svn:mime-type=text/xul
*.yml = svn:mime-type=text/plain
CHANGES = svn:mime-type=text/plain
COPYING = svn:mime-type=text/plain
INSTALL = svn:mime-type=text/plain
Makefile* = svn:mime-type=text/plain
README = svn:mime-type=text/plain
TODO = svn:mime-type=text/plain

From dmessina at wustl.edu Thu Jun 28 11:11:23 2007
From: dmessina at wustl.edu (David Messina)
Date: Thu, 28 Jun 2007 10:11:23 -0500
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683B593.3050108@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
 <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
 <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
 <4673C7CB.1030709@mail.nih.gov>
 <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
 <18049.30026.61328.134490@almost.alerce.com>
 <4683A7D1.8070403@sendu.me.uk>
 <18051.46242.942184.758493@almost.alerce.com>
 <4683B593.3050108@sendu.me.uk>
Message-ID:

> [Sendu]
>
> Yes. If you don't want everything, how does one 'browse' the repository
> to find out the address of the thing you /do/ want?
svn ls file:///home/hartzell/bioperl

(when logged into dev.open-bio.org) or

svn ls svn+ssh://dev.open-bio.org/home/hartzell/bioperl

From n.haigh at sheffield.ac.uk Thu Jun 28 11:13:58 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 28 Jun 2007 16:13:58 +0100
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683B593.3050108@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
 <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
 <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
 <4673C7CB.1030709@mail.nih.gov>
 <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
 <18049.30026.61328.134490@almost.alerce.com>
 <4683A7D1.8070403@sendu.me.uk>
 <18051.46242.942184.758493@almost.alerce.com>
 <4683B593.3050108@sendu.me.uk>
Message-ID: <4683D036.5060109@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sendu Bala wrote:
> George Hartzell wrote:
>> Sendu Bala writes:
>>> I just tried:
>>>
>>> svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl
> [snip]
>> It looks like you tried to check out the *entire* repository.
>
> Yes. If you don't want everything, how does one 'browse' the repository
> to find out the address of the thing you /do/ want?
>

You could try:
svn ls
or
svn ls -R

to get a list of directories.

>
>> It never occurred to me to try that. I'll take a look at what you
>> reported.
>
> Cheers.
> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGg9A2czuW2jkwy2gRAgirAKCnMAg6a7W7RM22O2rOi4vD5w3HPwCePsku akLhIszoQbRc/aVX3d/Jp7w= =mlHY -----END PGP SIGNATURE----- From cjfields at uiuc.edu Thu Jun 28 11:20:46 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 28 Jun 2007 10:20:46 -0500 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <4683C385.3050904@sendu.me.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> <4683C385.3050904@sendu.me.uk> Message-ID: <6830CBB0-70A1-4F84-9C52-F61AEDC4B83B@uiuc.edu> I can replicate the same problem (Mac OS X) with a full checkout: svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data' svn: Can't move source to dest svn: Can't move 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/ tmp/prop-base/HUMBETGLOA.FASTA.svn-base' to 'bioperl/bioperl-live/ tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': No such file or directory What local (mac) svn version are you using? I'm running off macports: svn --version svn, version 1.4.4 (r25188) compiled Jun 16 2007, 23:40:53 chris On Jun 28, 2007, at 9:19 AM, Sendu Bala wrote: ... > I tried again in the same location and it told me I had to 'svn > cleanup', which I did. But subsequently it kept complaining about > files > already being there. >> > [snip] >> Can another mac os x user out there give the Great Big Checkout a try >> and see if it runs to completion. 
Potential problems that come to >> mind are: >> >> - the "mac's are case insensitive, sort of" problem >> - you filled up your disk >> - something else. > > Well, I didn't run out of disc space. After a rm -fr * and trying > again > it failed at exactly the same point, in the same way. > > svn co > svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/ > release-0-9-2/t/data > > causes this repeatable problem: > > [...] > A data/phredfile.phd > svn: In directory 'data' > svn: Can't move source to dest > svn: Can't move 'data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' to > 'data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': No such file or > directory > > That is with Mac OS X svn command-line client, version 1.4.4 > > I can get bioperl-live/tags/release-0-9-2/t/data to check out fine > with > a linux svn command-line client, version 1.2.3. > > > Cheers, > Sendu. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Thu Jun 28 11:37:27 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 28 Jun 2007 10:37:27 -0500 Subject: [Bioperl-l] Splits again In-Reply-To: <4683624F.6020402@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: On Jun 28, 2007, at 2:25 AM, Sendu Bala wrote: > Chris Fields wrote: >> ... > > The short and sweet version: my proposal has all the benefits of > yours, but none of the disadvantages. What's not to like? 
The short and sweet version: I'm more convinced after you laid out your argument in detail, which would have saved me some typing last night, BTW, thanks! ; > The other core devs need to chip in and we need to openly (candidly) discuss it some more (I've added Hilmar to this). There is also a tenable solution that allows both aspects ('cliques' and single mode) which might make everybody happy. Let's say we only want to install Bio::SeqIO::genbank. The Bio::SeqIO::genbank Build.PL would only install what was needed (as you indicated), only Bio::SeqIO::genbank-related tests would run (along with dependency test, if available), and life would go on. However, what if we wanted to install everything in SeqIO/DB/AlignIO/ etc? We could have the Bio::SeqIO Build.PL ask whether you want all SeqIO modules installed or a select few (maybe a quick 'install all (y/n)?' followed by a list, which installs them one at a time along with dependencies), or have the option to specifically denote them as passed args to SeqIO's Build.PL, something like 'perl Build.PL - install-plugins genbank embl swiss', 'perl Build.PL -install-plugins all', etc. If a specific module (Bio::SeqIO::genbank) is installed directly then maybe the installation q&a's of followed modules could be bypassed when installing down the dependency tree with additional passed args. This would, in effect, be a bioperl-specific mini-CPAN within CPAN. Nice! Now, this doesn't address several related issues, such as how we handle versioning of the independent modules (should be in a controlled manner), what we do about deprecated modules which linger about on CPAN, how we deal with PPMs/RPMs/packaging, and so on. All have possible reasonable ways they can be addressed, I believe. Also, I think we should still think about doing regular full-scale 'stable' (1.#) releases (sort of our stamp of approval for that batch of modules at that point in time, with a reasonable 'sell-by' date). 
Again, it should be seriously discussed among the core devs and the bioperl community at large prior to any serious work on it, and it would be quite a large-scale project, but possibly worth it. It can only go forward if there is enough momentum behind it. >> Finally, all of this should wait until later. Much later, like >> after a decent release, after svn, etc kind of 'later'. I think >> we can agree on that. > > Hmm, not really. If it can be implemented by a change in just > Build.PL and ModuleBuildBioperl, its really independent of > everything else. That's the beauty of it: the only thing that > changes is how things are uploaded to and downloaded from CPAN. The > only person that normally deals with that issue is the pumpkin for > a release, and he only cares about it at release time. > > In fact, if we're going to do it at all it makes sense to try it > out on a minor release like 1.5.3. We've already got experience of > doing it split-style from 1.5.2. (And let me tell you: splits at > the code-base level suck.) BOSC is coming up, and I would like to focus on getting svn migration taken care of ASAP (which is sounding more and more like we plan on moving all open-bio over, unless I misread Jason's post?) and stomping of bugs (my next priority after EUtilities). Maybe in the interim we should try focusing on bug squashing, get out a quick standard dev release (1.5.3) before BOSC, and then a few of us could all communicate there via email/text/IM/phone off-list? Maybe post updates via the bioperl blog and list? > And where is the harm in letting them do it via CPAN as well? In > fact, there are significant benefits: ... I'm already pretty convinced... > The same can be achieved with CPAN bundles for each kind of > functional grouping you can think of. 
And since its just a single > text file that defines such a grouping, its easy to change or add > new ones as you feel like it, as opposed to the rather more > permanent and substantial effort of creating one of your splits on > the code-base level. ... or it could be run right in Module::Build for specific parent classes (as I mention above). Bundling could be instituted for something like a standard GBrowse release (Bundle::BioPerl::GBrowse) where the functionality might be more spread out (Bio::DB*, Bio::Graphics, Bio::FeatureIO, etc). For a full-scale old-style core install, another Bundle (Bundle::BioPerl::Standard). ... > Yes, it would be automated, and no, it wouldn't at all be any kind > of additional headache. I'm proposing a fully-automated system that > the pumpkin wouldn't even have to think about it. Much /less/ of a > headache than dealing with splits. Orders of magnitude easier to > deal with. The 'headache' would be the initial setup (splitting test, individual Build.PL, etc), but this could be done stepwise or section-wise, I suppose. ... > And the smallest, most concentrated set of modules is the > individual module. Well, only if it runs correctly (i.e. has the entire dep. tree installed). But the 'follow' tests would handle that. > The reason some of these existing splits (micoarray, ext) have > fallen by the way-side? /Because/ they're splits. If they had been > part of bioperl-live all along, they'd have been kept in a working, > compatible state and would have been released along with everything > else in 1.5.2 microarray fell out of favor for other reasons (much faster ways to do the same thing via R), though I think it still could be salvaged if someone wanted to take it up. the other bioperl distros (network, db, run, etc) would also necessitate following the same path as core, but I guess they could be bundled as well. > ... > No headaches. I already have one, sorry! 
chris From n.haigh at sheffield.ac.uk Thu Jun 28 11:53:52 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Thu, 28 Jun 2007 16:53:52 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: <4683D990.8090909@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Chris Fields wrote: > On Jun 28, 2007, at 2:25 AM, Sendu Bala wrote: > >> Chris Fields wrote: >>> ... >> >> The short and sweet version: my proposal has all the benefits of >> yours, but none of the disadvantages. What's not to like? > > The short and sweet version: I'm more convinced after you laid out your > argument in detail, which would have saved me some typing last night, > BTW, thanks! ; > > > The other core devs need to chip in and we need to openly (candidly) > discuss it some more (I've added Hilmar to this). There is also a > tenable solution that allows both aspects ('cliques' and single mode) > which might make everybody happy. Couldn't "cliques" simply be satisfied with CPAN Bundles? > > Let's say we only want to install Bio::SeqIO::genbank. The > Bio::SeqIO::genbank Build.PL would only install what was needed (as you > indicated), only Bio::SeqIO::genbank-related tests would run (along with > dependency test, if available), and life would go on. However, what if > we wanted to install everything in SeqIO/DB/AlignIO/etc? I think this might be where Bundles come in for installing these "cliques" of related modules? - -- snip -- > >> Yes, it would be automated, and no, it wouldn't at all be any kind of >> additional headache. I'm proposing a fully-automated system that the >> pumpkin wouldn't even have to think about it. 
Much /less/ of a >> headache than dealing with splits. Orders of magnitude easier to deal >> with. > > The 'headache' would be the initial setup (splitting test, individual > Build.PL, etc), but this could be done stepwise or section-wise, I suppose. Yes, I think this is where most of the labour will be. However, setting the test suite up like this would be beneficial with or without publishing modules individually. - -- snip -- -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGg9mQczuW2jkwy2gRAlfBAKCFP7XUvWXsjycSv0MVGN3Ru40D/wCcDiDg UKE/Q/wA3gu1Gb7S6rarCQw= =WQdY -----END PGP SIGNATURE----- From bix at sendu.me.uk Thu Jun 28 12:03:54 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 17:03:54 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: <4683DBEA.90005@sendu.me.uk> Chris Fields wrote: > On Jun 28, 2007, at 2:25 AM, Sendu Bala wrote: > Let's say we only want to install Bio::SeqIO::genbank. The > Bio::SeqIO::genbank Build.PL would only install what was needed (as you > indicated), only Bio::SeqIO::genbank-related tests would run (along with > dependency test, if available), and life would go on. However, what if > we wanted to install everything in SeqIO/DB/AlignIO/etc? > > We could have the Bio::SeqIO Build.PL ask whether you want all SeqIO > modules installed or a select few (maybe a quick 'install all (y/n)?' 
> followed by a list, which installs them one at a time along with > dependencies), or have the option to specifically denote them as passed > args to SeqIO's Build.PL, something like 'perl Build.PL -install-plugins > genbank embl swiss', 'perl Build.PL -install-plugins all', etc. If a > specific module (Bio::SeqIO::genbank) is installed directly then maybe > the installation q&a's of followed modules could be bypassed when > installing down the dependency tree with additional passed args. I'd probably stay away from something like this. My primary reason being, off-the-top-of-my-head I don't see how to get it to work. If you're installing Bio::SeqIO for the first time via CPAN you can't ask it to install Bio::SeqIO::genbank et al. at the same time because Bio::SeqIO::genbank depends on Bio::SeqIO, giving us some circularity. I also wouldn't want these things to be complicated. There should be little in the way of questions to ask during install. Each module's Build.PL should be ultra-simple with no advanced logic at all. It should just specify things that are absolute requirements. This simplicity helps avoid some of the problems we face by distributing the monolithic Bioperl. No, much better for us and for users to provide a Bundle::Bio-SeqIO. > Now, this doesn't address several related issues, such as how we handle > versioning of the independent modules (should be in a controlled > manner), When a module is changed, it gets a version bump. Nothing complicated needs to be done. Transparent and obvious, behaving like all other CPAN modules would be my choice. > what we do about deprecated modules which linger about on CPAN, Delete them from CPAN seems appropriate. > how we deal with PPMs/RPMs/packaging, and so on. All have possible > reasonable ways they can be addressed, I believe. 
Also, I think we > should still think about doing regular full-scale 'stable' (1.#) > releases (sort of our stamp of approval for that batch of modules at > that point in time, with a reasonable 'sell-by' date). Yes, we can still choose to take a snapshot and announce it to the world, but at the module-level nothing special would happen. There would just be an updated Bundle::Bioperl-everything (or whatever). > Again, it should be seriously discussed among the core devs and the > bioperl community at large prior to any serious work on it, and it would > be quite a large-scale project, but possibly worth it. It can only go > forward if there is enough momentum behind it. The requirement for this approach is per-module test scripts. Which as I identified already, is very desirable anyway so we can hit 100% test coverage. So, regardless of anything else can we all agree that per-module test scripts are a good idea and should be worked on? If so, I'll look into the feasibility and figure out how much work will be involved. From cjfields at uiuc.edu Thu Jun 28 13:17:50 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 28 Jun 2007 12:17:50 -0500 Subject: [Bioperl-l] Splits again In-Reply-To: <4683DBEA.90005@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> <4683DBEA.90005@sendu.me.uk> Message-ID: <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu> On Jun 28, 2007, at 11:03 AM, Sendu Bala wrote: > ... > I'd probably stay away from something like this. My primary reason > being, off-the-top-of-my-head I don't see how to get it to work. If > you're installing Bio::SeqIO for the first time via CPAN you can't > ask it to install Bio::SeqIO::genbank et al. 
at the same time
> because Bio::SeqIO::genbank depends on Bio::SeqIO, giving us some
> circularity.

True...

> I also wouldn't want these things to be complicated. There should
> be little in the way of questions to ask during install. Each
> module's Build.PL should be ultra-simple with no advanced logic at
> all. It should just specify things that are absolute requirements.
> This simplicity helps avoid some of the problems we face by
> distributing the monolithic Bioperl.
>
> No, much better for us and for users to provide a Bundle::Bio-SeqIO.

I just don't want too much Bundle-itis as it'll get confusing for
newbies (e.g. Vista-itis, or AdobeCS-itis). It should be limited to
functional grouping (SeqIO, AlignIO, DB, etc), 'install everything',
or distribution-specific (GBrowse). I also think (though Hilmar may
veto this) that we should work on integrating bioperl-db, network,
etc. into this if it goes forward.

Here's a question: how do we plan on handling uploading bioperl
updates to CPAN via PAUSE? Do we want to run every single module
through one pumpkin? Or do we want to have a core dev group PAUSE
account? I can see, for instance, removing everything
EUtilities-related and submitting it independently using my own PAUSE
account, but it would be nice to have it under an umbrella
'bioperl-devs' account instead.

>> Now, this doesn't address several related issues, such as how we
>> handle versioning of the independent modules (should be in a
>> controlled manner),
>
> When a module is changed, it gets a version bump. Nothing
> complicated needs to be done. Transparent and obvious, behaving
> like all other CPAN modules would be my choice.
>
>> what we do about deprecated modules which linger about on CPAN,
>
> Delete them from CPAN seems appropriate.

I know you can do that via PAUSE, but I think it lingers about on
search.cpan.org (unless that's been fixed). This would probably have
to be used sparingly.

>> how we deal with PPMs/RPMs/packaging, and so on.
All have >> possible reasonable ways they can be addressed, I believe. Also, >> I think we should still think about doing regular full-scale >> 'stable' (1.#) releases (sort of our stamp of approval for that >> batch of modules at that point in time, with a reasonable 'sell- >> by' date). > > Yes, we can still choose to take a snapshot and announce it to the > world, but at the module-level nothing special would happen. There > would just be an updated Bundle::Bioperl-everything (or whatever). Right, it would basically be a stamp of certification. >> Again, it should be seriously discussed among the core devs and >> the bioperl community at large prior to any serious work on it, >> and it would be quite a large-scale project, but possibly worth >> it. It can only go forward if there is enough momentum behind it. > > The requirement for this approach is per-module test scripts. Which > as I identified already, is very desirable anyway so we can hit > 100% test coverage. > > So, regardless of anything else can we all agree that per-module > test scripts are a good idea and should be worked on? If so, I'll > look into the feasibility and figure out how much work will be > involved. I think so, but the feasibility issue is critical. Do we want cvs/ svn to be divided up into 900 subdirectories (one for each module), or do we want to have a similar directory structure as we have now, but with each module in it's own directory? Or leave everything as is and generate Build.PL on-the-fly (prob. least feasible)? This is where it might be wise to do it piece-meal at first (maybe starting with something somewhat segregated like Bio::Tools), then progress from there. 
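The circularity point quoted earlier in this message can be sketched with a small dependency graph. This is a hedged Python illustration only, not how CPAN or Module::Build resolve prerequisites: the helpers `install_order` and `has_cycle` are hypothetical, and the dependency edges are simplified assumptions. A plugin that requires its parent installs in a clean order; if the parent's installer also demanded the plugin, no order exists.

```python
from graphlib import CycleError, TopologicalSorter  # Python 3.9+


def install_order(deps):
    """Return modules ordered so each one follows its prerequisites.

    `deps` maps a module name to the set of modules it requires.
    """
    return list(TopologicalSorter(deps).static_order())


def has_cycle(deps):
    """True if the requirements cannot be satisfied in any order."""
    try:
        install_order(deps)
    except CycleError:
        return True
    return False


# Simplified, assumed edges: the plugin requires its parent class.
plugin_only = {
    "Bio::SeqIO::genbank": {"Bio::SeqIO"},
    "Bio::SeqIO": {"Bio::Root::Root"},
    "Bio::Root::Root": set(),
}

# Same graph, but the parent's installer also pulls in the plugin.
circular = dict(plugin_only)
circular["Bio::SeqIO"] = {"Bio::Root::Root", "Bio::SeqIO::genbank"}

if __name__ == "__main__":
    print(install_order(plugin_only))
    print("circular:", has_cycle(circular))
```

A bundle sidesteps the cycle because it only lists modules to install and is not itself a prerequisite of any of them.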
chris From hartzell at alerce.com Thu Jun 28 13:38:48 2007 From: hartzell at alerce.com (George Hartzell) Date: Thu, 28 Jun 2007 13:38:48 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> Message-ID: <18051.61992.627473.323346@almost.alerce.com> David Messina writes: > > [George] > > Likewise, you probably DON'T want to use this in your config file: > > > > enable-auto-props = yes > > * = svn:keywords="Author Date Id Rev URL" > > > > since it'll do the same thing. > > Ah, so I've been doing it wrong all along then. :) Thanks, George! It's not *wrong* if it's never done anything to you that you've regretted. The right answer depends on your situation.... > [...] > I've googled around and gathered the following as a possible list for > our repo. Since I obviously don't know what I'm doing :), of course > adjust and refine as necessary. > That's a great starting point. Do you have write access to the wiki? Could you link it off of the instructions for using svn? g. 
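To make the per-extension advice in this exchange concrete, here is a minimal Python sketch of how an `[auto-props]` section maps a file name to properties. It is an illustration under stated assumptions, not Subversion's implementation: the helper names `load_auto_props` and `props_for` and the short sample config are made up for this example, and the matching here is case-sensitive for simplicity, which may differ from what the svn client does.

```python
import configparser
import fnmatch

# Small sample in the style of the list posted above (assumed, not the full list)
SAMPLE_CONFIG = """
[miscellany]
enable-auto-props = yes

[auto-props]
*.pl = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:executable
*.pm = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"
*.png = svn:mime-type=image/png
"""


def load_auto_props(text):
    """Parse the [auto-props] section into {pattern: property-spec}."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep patterns such as "Makefile*" as written
    parser.read_string(text)
    return dict(parser["auto-props"])


def props_for(filename, auto_props):
    """Collect the properties every matching pattern would set."""
    props = {}
    for pattern, spec in auto_props.items():
        if not fnmatch.fnmatchcase(filename, pattern):
            continue
        for item in spec.split(";"):
            item = item.strip()
            if not item:
                continue
            # "name=value" or a bare "name" (e.g. svn:executable)
            name, _, value = item.partition("=")
            props[name.strip()] = value.strip().strip('"')
    return props


if __name__ == "__main__":
    rules = load_auto_props(SAMPLE_CONFIG)
    for name in ("script.pl", "logo.png", "notes.txt"):
        print(name, props_for(name, rules))
```

With per-extension patterns like these, a binary file such as `logo.png` never picks up `svn:keywords`, which is the risk George describes with a catch-all `*` rule.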
From hartzell at alerce.com Thu Jun 28 14:06:50 2007 From: hartzell at alerce.com (George Hartzell) Date: Thu, 28 Jun 2007 14:06:50 -0400 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <4683C385.3050904@sendu.me.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> <4683C385.3050904@sendu.me.uk> Message-ID: <18051.63674.685297.426813@almost.alerce.com> Sendu Bala writes: > [...] > I tried again in the same location and it told me I had to 'svn > cleanup', which I did. But subsequently it kept complaining about files > already being there. You need to do the cleanup because svn exited gracelessly and you needed to help it get back on its feet. The cleanup doesn't remove the stuff that you did get checked out, so it's still there getting in the way of your new checkout. > [...] > svn co > svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data > > causes this repeatable problem: > > [...] > A data/phredfile.phd > svn: In directory 'data' > svn: Can't move source to dest > svn: Can't move 'data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' to > 'data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': No such file or directory > > That is with Mac OS X svn command-line client, version 1.4.4 > > I can get bioperl-live/tags/release-0-9-2/t/data to check out fine with > a linux svn command-line client, version 1.2.3. I'm not 100% sure what's going on here, but I'm inclined to say "get a real computer" (and yes, I'm typing this on a mac...). I have a mac pro that runs FreeBSD-STABLE part time and it's ggrrreeaatt (as tony the tiger used to say).... I think that we're having trouble with case sensitivity.
My only evidence is that I can see where there have been both HUMBETGLOA.FASTA and HUMBETGLOA.fasta in the tree at various times. I can't figure out anything else that's weird about that file. On the other hand, I can't see how this would cause the error you're seeing though. The experiment would be to grab a usb or firewire disk (or even a memory stick), partition/format it as case sensitive (or even *unix*) and try to do svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data into it. If it works, voila. If not, I'll keep making stuff up, err, thinking about it. g. From dmessina at wustl.edu Thu Jun 28 14:15:32 2007 From: dmessina at wustl.edu (David Messina) Date: Thu, 28 Jun 2007 13:15:32 -0500 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <6830CBB0-70A1-4F84-9C52-F61AEDC4B83B@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> <4683C385.3050904@sendu.me.uk> <6830CBB0-70A1-4F84-9C52-F61AEDC4B83B@uiuc.edu> Message-ID: <459D9BC0-4FBA-4560-80A8-E6243DE9D9CC@wustl.edu> Same svn error here on the full checkout. > What local (mac) svn version are you using? I'm running off macports: > > svn --version > svn, version 1.4.4 (r25188) > compiled Jun 16 2007, 23:40:53 I have svn 1.4.3. % svn --version svn, version 1.4.3 (r23084) compiled Apr 1 2007, 02:47:14 Copyright (C) 2000-2006 CollabNet. Subversion is open source software, see http://subversion.tigris.org/ This product includes software developed by CollabNet (http:// www.Collab.Net/). The following repository access (RA) modules are available: * ra_dav : Module for accessing a repository via WebDAV (DeltaV) protocol. 
- handles 'http' scheme * ra_svn : Module for accessing a repository using the svn network protocol. - handles 'svn' scheme * ra_local : Module for accessing a repository on local disk. - handles 'file' scheme From cjfields at uiuc.edu Thu Jun 28 14:54:15 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 28 Jun 2007 13:54:15 -0500 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <18051.63674.685297.426813@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> <4683C385.3050904@sendu.me.uk> <18051.63674.685297.426813@almost.alerce.com> Message-ID: On Jun 28, 2007, at 1:06 PM, George Hartzell wrote: > ... > I'm not 100% sure what's going on here, but I'm inclined to say "get a > real computer" (and yes, I'm typing this on a mac...). I have a mac > pro that runs FreeBSD-STABLE part time and it's ggrrreeaatt (as tony > the tiger used to say).... Ouch! Though it could be worse (**coughwindowscough**). > I think that we're having trouble with case sensitivity. My only > evidence is that I can see where there have been both HUMBETGLOA.FASTA > and HUMBETGLOA.fasta in the tree at various times. I can't figure out > anything else that's weird about that file. On the other hand, I > can't see how this would cause the error you're seeing though. Odd that other branches (including the main trunk) work but that one doesn't. > The experiment would be to grab a usb or firewire disk (or even a > memory stick), partition/format it as case sensitive (or even *unix*) > and try to do > > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- > live/tags/release-0-9-2/t/data > > into it. If it works, voila. 
If not, I'll keep making stuff up, err, > thinking about it. > > g. I'll have to figure out why I can't get ssh keys to work locally to test it out more (I have a usb drive to test with); just don't have time at the moment. chris From dmessina at wustl.edu Thu Jun 28 14:47:04 2007 From: dmessina at wustl.edu (David Messina) Date: Thu, 28 Jun 2007 13:47:04 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <18051.61992.627473.323346@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> <18051.61992.627473.323346@almost.alerce.com> Message-ID: <0027C4E0-26B1-41F3-8FD8-EAB5465CA80E@wustl.edu> > That's a great starting point. Do you have write access to the wiki? > Could you link it off of the instructions for using svn? Done. 
http://www.bioperl.org/wiki/Svn_auto-props linked from: http://www.bioperl.org/wiki/Using_Subversion (bottom of page) From bix at sendu.me.uk Thu Jun 28 15:19:35 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 20:19:35 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> <4683DBEA.90005@sendu.me.uk> <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu> Message-ID: <468409C7.7020102@sendu.me.uk> Chris Fields wrote: > On Jun 28, 2007, at 11:03 AM, Sendu Bala wrote: > Here's a question: how do we plan on handling uploading bioperl > updates to CPAN via PAUSE? Do we want to run every single module > through one pumpkin? Or do we want to have a core dev group PAUSE > account? I can see, for instance, removing everything EUtilities- > related and submitting it independently using my own PAUSE account, > but it would be nice to have it under an umbrella 'bioperl-devs' > account instead. All Bioperl modules (except the Bundle!) are owned by BIOPERLML on PAUSE. It's a little awkward since PAUSE is uploader-centric, but see my notes at http://www.bioperl.org/wiki/Making_a_BioPerl_release And certainly, everything that wants to consider itself part of Bioperl (and gain the benefit of lots of devs looking after it) should certainly have BIOPERLML as the primary owner. > I think so, but the feasibility issue is critical. Do we want cvs/ > svn to be divided up into 900 subdirectories (one for each module), > or do we want to have a similar directory structure as we have now, > but with each module in its own directory?
Or leave everything as > is and generate Build.PL on-the-fly (prob. least feasible)? Very definitely the latter. The key benefit of my approach is that the organisation stays as is and that a snapshot of the repository remains a single directory of modules in Bio so that people don't have to 'install' Bioperl, they can still just uncompress the archive (or check out the package from svn) and point their PERL5LIB to the root dir of the package. For that reason I very much like the idea of folding the current split-out packages (run, network etc.) back into the core package so everything is one place. Folding them back in should obviously wait until everything is in place and working with core already. My proposal obviously wasn't very clear. As far as all other devs are concerned, nothing changes at all (except for lots of new improved test scripts). The pumpkin will, however, be able to say: ./Build dist Right now that generates the distribution archives (in different compression formats) - one big archive containing everything. My proposal is simply that instead it generates lots of archives, one archive per module. It will also generate some Bundles and whatever else might be needed. I don't envisage any major difficulties in achieving this. The 'feasibility' issue I was going to look into was strictly regarding doing all the new test scripts. 
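To make the one-archive-per-module idea a bit more concrete, here is a small Python sketch of the naming step only. It is not what `./Build dist` or Module::Build does today; the function names, the .tar.gz suffix, and the example path and version are assumptions for illustration. It turns a module path relative to the repository root into a CPAN-style distribution name and pairs each one with an archive file name.

```python
from pathlib import PurePosixPath


def dist_name(module_path):
    """Map a module path such as 'Bio/DB/EUtilities.pm' to 'Bio-DB-EUtilities'."""
    path = PurePosixPath(module_path)
    if path.suffix != ".pm":
        raise ValueError(f"not a Perl module: {module_path}")
    # Drop the extension, then join the path components with hyphens.
    return "-".join(path.with_suffix("").parts)


def plan_archives(module_paths, version):
    """Return {module path: archive file name}, one archive per module."""
    return {
        path: f"{dist_name(path)}-{version}.tar.gz"
        for path in sorted(module_paths)
    }


if __name__ == "__main__":
    # Example inputs only; the version string is a placeholder.
    for path, archive in plan_archives(["Bio/DB/EUtilities.pm"], "1.5.3").items():
        print(path, "->", archive)
```

The repository layout stays untouched in this picture; only the packaging step fans out, which matches the point that nothing changes for other devs.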
From hartzell at alerce.com Thu Jun 28 15:43:38 2007 From: hartzell at alerce.com (George Hartzell) Date: Thu, 28 Jun 2007 15:43:38 -0400 Subject: [Bioperl-l] First cut svn repository In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> <4683C385.3050904@sendu.me.uk> <18051.63674.685297.426813@almost.alerce.com> Message-ID: <18052.3946.224905.415905@almost.alerce.com> Chris Fields writes: > > On Jun 28, 2007, at 1:06 PM, George Hartzell wrote: > > > ... > > I'm not 100% sure what's going on here, but I'm inclined to say "get a > > real computer" (and yes, I'm typing this on a mac...). I have a mac > > pro that runs FreeBSD-STABLE part time and it's ggrrreeaatt (as tony > > the tiger used to say).... > > Ouch! Though it could be worse (**coughwindowscough**). > > > I think that we're having trouble with case sensitivity. My only > > evidence is that I can see where there have been both HUMBETGLOA.FASTA > > and HUMBETGLOA.fasta in the tree at various times. I can't figure out > > anything else that's weird about that file. On the other hand, I > > can't see how this would cause the error you're seeing though. > > Odd that other branches (including the main trunk) work but that one > doesn't. > > > The experiment would be to grab a usb or firewire disk (or even a > > memory stick), partition/format it as case sensitive (or even *unix*) > > and try to do > > > > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- > > live/tags/release-0-9-2/t/data > > > > into it. If it works, voila. If not, I'll keep making stuff up, err, > > thinking about it. > > > > g. 
> > I'll have to figure out why I can't get ssh keys to work locally to > test it out more (I have a usb drive to test with); just don't have > time at the moment. I just did the experiment, and filename-insensitivity seems to be breaking something. I'm using an svn I picked up from http://www.codingmonkeys.de/mbo/. I reformatted a memory stick to be case sensitive and co of bioperl/bioperl-live/tags/release-0-9-2/t worked, then I made a directory in my home dir (normal mac thing) and got the same error as above. I can get a copy of the trunk, so I'm inclined to ask someone to mention the problem on the wiki and then just ignore it. g. From cjfields at uiuc.edu Thu Jun 28 16:29:09 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 28 Jun 2007 15:29:09 -0500 Subject: [Bioperl-l] Splits again In-Reply-To: <468409C7.7020102@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> <4683DBEA.90005@sendu.me.uk> <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu> <468409C7.7020102@sendu.me.uk> Message-ID: <026156F4-4C46-4CC6-82B5-07FC5326A244@uiuc.edu> On Jun 28, 2007, at 2:19 PM, Sendu Bala wrote: > Chris Fields wrote: >> On Jun 28, 2007, at 11:03 AM, Sendu Bala wrote: >> Here's a question: how do we plan on handling uploading bioperl >> updates to CPAN via PAUSE? Do we want to run every single module >> through one pumpkin? Or do we want to have a core dev group PAUSE >> account? I can see, for instance, removing everything EUtilities- >> related and submitting it independently using my own PAUSE account, >> but it would be nice to have it under an umbrella 'bioperl-devs' >> account instead. > > All Bioperl modules (except the Bundle!) are owned by BIOPERLML on > PAUSE. 
It's a little awkward since PAUSE is uploader-centric, but see my > notes at http://www.bioperl.org/wiki/Making_a_BioPerl_release > > And certainly, everything that wants to consider itself part of > Bioperl > (and gain the benefit of lots of devs looking after it) should > certainly > have BIOPERLML as the primary owner. Alrighty then. >> I think so, but the feasibility issue is critical. Do we want cvs/ >> svn to be divided up into 900 subdirectories (one for each module), >> or do we want to have a similar directory structure as we have now, >> but with each module in its own directory? Or leave everything as >> is and generate Build.PL on-the-fly (prob. least feasible)? > > Very definitely the latter. The key benefit of my approach is that the > organisation stays as is and that a snapshot of the repository > remains a > single directory of modules in Bio so that people don't have to > 'install' Bioperl, they can still just uncompress the archive (or > check > out the package from svn) and point their PERL5LIB to the root dir of > the package. Okay, makes sense. > For that reason I very much like the idea of folding the current > split-out packages (run, network etc.) back into the core package so > everything is one place. Folding them back in should obviously wait > until everything is in place and working with core already. I agree, but that's up to Brian, Hilmar, and the others who donated the packages (or at least a consensus of core devs). One thing at a time. > My proposal obviously wasn't very clear. As far as all other devs are > concerned, nothing changes at all (except for lots of new improved > test > scripts). The pumpkin will, however, be able to say: > > ./Build dist > > Right now that generates the distribution archives (in different > compression formats) - one big archive containing everything. > My proposal is simply that instead it generates lots of archives, one > archive per module.
It will also generate some Bundles and whatever > else > might be needed. We'll need to define which tests and data goes with each module and so on. > I don't envisage any major difficulties in achieving this. The > 'feasibility' issue I was going to look into was strictly regarding > doing all the new test scripts. Okay. Maybe it's worth doing on a branch as a test run when 1.5.3 is ready to go. We'll still need to get thoughts on this from other core devs out there, and it prob. should until everybody is comfortable with the idea. chris From dmessina at wustl.edu Thu Jun 28 18:13:48 2007 From: dmessina at wustl.edu (David Messina) Date: Thu, 28 Jun 2007 17:13:48 -0500 Subject: [Bioperl-l] Splits again In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: Coming late to this party, I'm replying to snippets from multiple emails. > [Chris] > what we do about deprecated modules which linger > about on CPAN > [Sendu] > Delete them from CPAN seems appropriate. I coulda sworn this was frowned upon, but a recent thread suggests it's totally kosher. http://www.nntp.perl.org/group/perl.qa/2007/03/msg8473.html > [Sendu] > So, regardless of anything else can we all agree that per-module test > scripts are a good idea and should be worked on? I agree. > [Sendu] > people don't have to > 'install' Bioperl, they can still just uncompress the archive (or > check > out the package from svn) and point their PERL5LIB to the root dir of > the package. Could you elaborate a bit on how this works? How is XS code that needs compiling handled? Or the scripts directory? I would love to be able to do this. 
> [Sendu] > For that reason I very much like the idea of folding the current > split-out packages (run, network etc.) back into the core package so > everything is one place. Folding them back in should obviously wait > until everything is in place and working with core already. From an organizational standpoint, I'm concerned that with ~900 modules in core right now, adding all of the additional stuff from the split-out packages would make for a daunting directory. But as you said, this is way down the road, so this proposal doesn't bear on the other, closer-to-now issues on the table. > [Chris] > Okay. Maybe it's worth doing on a branch as a test run when 1.5.3 > is ready to go. We'll still need to get thoughts on this from other > core devs out there, and it prob. should until everybody is > comfortable with the idea. If we go forward with the CPAN split plan, I like the idea of having a trial. We can foresee some of the issues that such a change may bring, and yet still more no doubt wait for us once we do it. Dave From bix at sendu.me.uk Thu Jun 28 18:59:35 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 23:59:35 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: <46843D57.2080409@sendu.me.uk> David Messina wrote: >> people don't have to 'install' Bioperl, they can still just >> uncompress the archive (or check out the package from svn) and >> point their PERL5LIB to the root dir of the package. > > Could you elaborate a bit on how this works? How is XS code that > needs compiling handled? Or the scripts directory? I would love to be > able to do this. I meant for the most part. 
Core doesn't have any XS code so that's not an issue. Scripts can be run manually like any other perl script. When you discover something isn't working because of a missing external dependency, you just install it. (But that happens very rarely.) Personally I've /never/ installed Bioperl and used that installed set of modules. I've always just pointed my PERL5LIB at the distribution folder or my cvs checkout. Which makes me a strange candidate for advocating all these CPAN-specific changes, but there you go ;) From cjfields at uiuc.edu Thu Jun 28 19:03:02 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 28 Jun 2007 18:03:02 -0500 Subject: [Bioperl-l] Splits again In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: <8B6FBB52-5CCE-4122-876C-B9827C86E46E@uiuc.edu> On Jun 28, 2007, at 5:13 PM, David Messina wrote: > Coming late to this party, I'm replying to snippets from multiple > emails. > > >> [Chris] >> what we do about deprecated modules which linger >> about on CPAN > >> [Sendu] >> Delete them from CPAN seems appropriate. > > I coulda sworn this was frowned upon, but a recent thread suggests > it's totally kosher. > > http://www.nntp.perl.org/group/perl.qa/2007/03/msg8473.html As long as it doesn't show up somewhere to confuse newbies I'm okay with it. >> [Sendu] >> people don't have to >> 'install' Bioperl, they can still just uncompress the archive (or >> check >> out the package from svn) and point their PERL5LIB to the root dir of >> the package. > > Could you elaborate a bit on how this works? How is XS code that > needs compiling handled? Or the scripts directory? I would love to > be able to do this. 
Maybe Sendu can add to this, but the XS code is limited to bioperl-ext AFAIK. We could keep that separate until it plays well with bioperl itself. Scripts and examples - maybe packaged along with a Bundle? >> [Sendu] >> For that reason I very much like the idea of folding the current >> split-out packages (run, network etc.) back into the core package so >> everything is one place. Folding them back in should obviously wait >> until everything is in place and working with core already. > > From an organizational standpoint, I'm concerned that with ~900 > modules in core right now, adding all of the additional stuff from > the split-out packages would make for a daunting directory. > > But as you said, this is way down the road, so this proposal > doesn't bear on the other, closer-to-now issues on the table. Well, the code in bioperl-db and network complements code in core, so I agree with Sendu they belong there. They should be under the same scrutiny as the rest anyway (code, tests, etc), but won't be bundled unless there is an 'install everything' Bundle. >> [Chris] >> Okay. Maybe it's worth doing on a branch as a test run when 1.5.3 >> is ready to go. We'll still need to get thoughts on this from other >> core devs out there, and it prob. should until everybody is >> comfortable with the idea. > > If we go forward with the CPAN split plan, I like the idea of > having a trial. We can foresee some of the issues that such a > change may bring, and yet still more no doubt wait for us once we > do it. That's what branches are for: testing stuff out like this. chris From hartzell at alerce.com Thu Jun 28 19:05:32 2007 From: hartzell at alerce.com (George Hartzell) Date: Thu, 28 Jun 2007 19:05:32 -0400 Subject: [Bioperl-l] problem with binary files. Message-ID: <18052.16060.932502.183552@almost.alerce.com> Ok, after pointing out the problem with setting the svn:keywords property on binary files, it turns out that I *did* that.
Worse yet, I set the svn:eol-style to 'native' on everything, including binary files, so depending on your platform they're likely to be fubar. For example, bioperl-run/t/data/H_pylori_J99.glimmer2.icm may or may not be what you expect it to be, depending on whether your eol-style matches the servers and whether any conversions were done. I'll touch up the way that the little tool I'm using calls cvs2svn and redo the repository. g. From n.haigh at sheffield.ac.uk Fri Jun 29 02:59:21 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 29 Jun 2007 07:59:21 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: <4684ADC9.8040404@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 - -- split -- >> [Sendu] >> For that reason I very much like the idea of folding the current >> split-out packages (run, network etc.) back into the core package so >> everything is one place. Folding them back in should obviously wait >> until everything is in place and working with core already. > > From an organizational standpoint, I'm concerned that with ~900 > modules in core right now, adding all of the additional stuff from > the split-out packages would make for a daunting directory. > > But as you said, this is way down the road, so this proposal doesn't > bear on the other, closer-to-now issues on the table. > I don't think this is an issue - it would simply mean everything is under the same version control hierarchy. And with svn it's Soooooo much easier to fiddle around with directory structures > > >> [Chris] >> Okay. Maybe it's worth doing on a branch as a test run when 1.5.3 >> is ready to go. 
We'll still need to get thoughts on this from other >> core devs out there, and it prob. should until everybody is >> comfortable with the idea. > > If we go forward with the CPAN split plan, I like the idea of having > a trial. We can foresee some of the issues that such a change may > bring, and yet still more no doubt wait for us once we do it. > Under svn it would be easy to make an "svn copy" of run, network etc into a branch of live to test this out. Not that this might be a problem, but since we are looking at bioperl-* packages being under the same svn repository, "svn copy" operations are cheap on disk space. > > Dave > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGhK3JczuW2jkwy2gRAtI2AJ4kNrpGY8XMMh9KxOqs+l0PrEVcwgCfVFj6 BCvltmPyWF4ImueYmd7VFAc= =ktl+ -----END PGP SIGNATURE----- From n.haigh at sheffield.ac.uk Fri Jun 29 03:05:33 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 29 Jun 2007 08:05:33 +0100 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <18051.61992.627473.323346@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> <18051.61992.627473.323346@almost.alerce.com> Message-ID: <4684AF3D.5090907@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 George Hartzell wrote: - -- snip -- > > [...] > > I've googled around and gathered the following as a possible list for > > our repo.
Since I obviously don't know what I'm doing :), of course > > adjust and refine as necessary. > > > > That's a great starting point. Do you have write access to the wiki? > Could you link it off of the instructions for using svn? > > g. Don't .t files need adding to the auto-props? Nath -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGhK89czuW2jkwy2gRAnRGAJ0VnBNVBAdQdfUnqPhmvsyQnD/bswCggSHC /Iivb6Lc4/51bUdrTmRQYlE= =V+t2 -----END PGP SIGNATURE----- From sac at bioperl.org Fri Jun 29 04:25:36 2007 From: sac at bioperl.org (Steve Chervitz) Date: Fri, 29 Jun 2007 01:25:36 -0700 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> Message-ID: <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> On 6/27/07, Chris Fields wrote: > > On Jun 26, 2007, at 3:21 PM, George Hartzell wrote: > > > ... > > If you have a dev.open-bio.org account and you're in the bioperl > > group, you're good to get at it via: > > > > file:///home/hartzell/bioperl > > > > or > > > > svn+ssh://dev.open-bio.org/home/hartzell/bioperl > > I managed to get it working using file://. Haven't tried svn+ssh yet > but I've had persistent problems getting ssh to work properly on my > macbook; not sure why yet but I haven't had time to play around with it. Are you using the ssh that comes installed with OSX? If so, I'd recommend installing openssh from MacPorts. I recall having issues with the stock version which were resolved by using the more up-to-date version you can get via MacPorts. 
BTW, I haven't been able to check out the new svn repository via svn+ssh:// because I can't get svn to authenticate with an alternative username. My username on dev.open-bio.org differs from what it is on my local machine, so I issue a command such as: steve@localhost $ svn --username sac checkout svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk but I get challenged with: steve@dev.open-bio.org's password: I also tried putting the --username argument after the subcommand, but it still wants to use my local username. I can ssh -l sac into the dev box no problem. Any suggestions? Steve From bix at sendu.me.uk Fri Jun 29 04:52:42 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 29 Jun 2007 09:52:42 +0100 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> Message-ID: <4684C85A.5030206@sendu.me.uk> Steve Chervitz wrote: > BTW, I haven't been able to check out the new svn repository via > svn+ssh:// because I can't get svn to authenticate with an alternative > username. My username on dev.open-bio.org differs from what it is on > my local machine, so I issue a command such as: > > steve@localhost $ svn --username sac checkout > svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk > > but I get challenged with: > steve@dev.open-bio.org's password: > > I also tried putting the --username argument after the subcommand, but > it still wants to use my local username. I can ssh -l sac into the dev > box no problem. Any suggestions?
Set up your ssh key on the dev machine. I'm also on a machine with the wrong username and it works even without attempting to supply the correct one. It does, however, show the 'Welcome to the new developer system' message 2 or 3 times for every svn+ssh action, which freaks me out a little. From N.Haigh at sheffield.ac.uk Fri Jun 29 05:32:38 2007 From: N.Haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 29 Jun 2007 10:32:38 +0100 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> Message-ID: <1183109558.4684d1b69bcec@webmail.shef.ac.uk> Quoting Steve Chervitz : -- snip -- > BTW, I haven't been able to check out the new svn repository via > svn+ssh:// because I can't get svn to authenticate with an alternative > username. My username on dev.open-bio.org differs from what it is on > my local machine, so I issue a command such as: > > steve at localhost $ svn --username sac checkout > svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk > > but I get challenged with: > steve at dev.open-bio.org's password: > > I also tried putting the --username argument after the subcommand, but > it still wants to use my local username. I can ssh -l sac into the dev > box no problem. Any suggestions? 
> > Steve > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > You could try: svn+ssh://USERNAME at dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk Nath From dmessina at wustl.edu Fri Jun 29 08:28:26 2007 From: dmessina at wustl.edu (David Messina) Date: Fri, 29 Jun 2007 07:28:26 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> Message-ID: <42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu> > > BTW, I haven't been able to check out the new svn repository via > svn+ssh:// because I can't get svn to authenticate with an alternative > username. I have the same issue. I set up a stanza in my ~/.ssh/config: Host dev.open-bio.org User dave_messina where dave_messina is my dev.open-bio.org username. 
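A minimal sketch of the ~/.ssh/config approach described above, written to a scratch file so it can be tried without touching a real configuration. The login name is the one given in the message; the scratch-file variable is purely illustrative and not part of ssh or svn.

```shell
#!/bin/sh
# Sketch only: write the Host/User stanza from the message to a scratch file.
# For real use the stanza belongs in ~/.ssh/config.
remote_user="dave_messina"    # replace with your own dev.open-bio.org login
config_file="$(mktemp)"       # illustrative scratch file, not a real ssh path

cat > "$config_file" <<EOF
Host dev.open-bio.org
    User $remote_user
EOF

# ssh rejects a config file that other users can write to, so keep it private.
chmod 600 "$config_file"
cat "$config_file"
```

With a stanza like this in place, ssh (and therefore svn+ssh://) should pick up the remote login name without it having to appear in the URL.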
From cjfields at uiuc.edu Fri Jun 29 13:00:27 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 29 Jun 2007 12:00:27 -0500 Subject: [Bioperl-l] ssh keys [was Re: First cut svn repository] In-Reply-To: <42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> <42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu> Message-ID: On Jun 29, 2007, at 7:28 AM, David Messina wrote: >> >> BTW, I haven't been able to check out the new svn repository via >> svn+ssh:// because I can't get svn to authenticate with an >> alternative >> username. > > I have the same issue. I set up a stanza in my ~/.ssh/config: > > Host dev.open-bio.org > User dave_messina > > where dave_messina is my dev.open-bio.org username. I changed to the macports ssh w/o luck. It appears the key is offered up, so maybe the problem is how I have everything set up on dev (though I followed everything on the wiki): .... Contact 'support at open-bio.org' for your new login information. ====================================== debug1: Authentications that can continue: publickey,gssapi-with- mic,password debug1: Next authentication method: publickey debug1: Offering public key: /Users/cjfields/.ssh/id_dsa debug2: we sent a publickey packet, wait for reply debug1: Authentications that can continue: publickey,gssapi-with- mic,password debug2: we did not send a packet, disable method debug1: Next authentication method: password It's odd; I can use passwordless logins for other servers (admittedly Mac servers) w/o problems using ssh keys, but dev.open-bio.org always prompts for a password regardless. 
My feeling is it's something with my local ssh or sshd config; I'll try fiddling with it to see what happens. Anyone have suggestions? I've lost enough hair as is; don't want to lose more! chris From sac at bioperl.org Fri Jun 29 13:07:45 2007 From: sac at bioperl.org (Steve Chervitz) Date: Fri, 29 Jun 2007 10:07:45 -0700 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <1183109558.4684d1b69bcec@webmail.shef.ac.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> <1183109558.4684d1b69bcec@webmail.shef.ac.uk> Message-ID: <8f200b4c0706291007x2b765323n75c9003a47fe7cbb@mail.gmail.com> On 6/29/07, Nathan S. Haigh wrote: > Quoting Steve Chervitz : > > -- snip -- > > > BTW, I haven't been able to check out the new svn repository via > > svn+ssh:// because I can't get svn to authenticate with an alternative > > username. My username on dev.open-bio.org differs from what it is on > > my local machine, so I issue a command such as: > > > > steve at localhost $ svn --username sac checkout > > svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk > > > > but I get challenged with: > > steve at dev.open-bio.org's password: > > > > I also tried putting the --username argument after the subcommand, but > > it still wants to use my local username. I can ssh -l sac into the dev > > box no problem. Any suggestions? > > [...] > You could try: > svn+ssh://USERNAME at dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk Bingo. Thanks for the tips, guys. BTW, setting up ssh keys was not the issue, since my key is already set up on the dev machine. 
The svn --username setting appears to not be operative at the ssh layer. I suspected this might be the case given that the usage info says: $ svn --help co --username arg : specify a username ARG --password arg : specify a password ARG which seemed insecure. I didn't want to send my password in the clear, and didn't know if or whether svn would hand it off to ssh. It wasn't even sending my username to ssh, so I knew something was wrong. These args are probably only intended for accessing local svn repositories, or non-svn+ssh-based checkouts. BTW, the svn+ssh check out on Mac OS X works for me. I'm using svn and openssh installed via MacPorts: $ svn --version svn, version 1.4.4 (r25188) compiled Jun 28 2007, 23:51:53 $ ssh -version OpenSSH_4.6p1, OpenSSL 0.9.8e 23 Feb 2007 Steve From hartzell at alerce.com Fri Jun 29 15:19:31 2007 From: hartzell at alerce.com (George Hartzell) Date: Fri, 29 Jun 2007 15:19:31 -0400 Subject: [Bioperl-l] ssh keys [was Re: First cut svn repository] In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> <42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu> Message-ID: <18053.23363.102371.602742@almost.alerce.com> Chris Fields writes: > > On Jun 29, 2007, at 7:28 AM, David Messina wrote: > > >> > >> BTW, I haven't been able to check out the new svn repository via > >> svn+ssh:// because I can't get svn to authenticate with an > >> alternative > >> username. > > > > I have the same issue. I set up a stanza in my ~/.ssh/config: > > > > Host dev.open-bio.org > > User dave_messina > > > > where dave_messina is my dev.open-bio.org username. > > I changed to the macports ssh w/o luck. 
It appears the key is > offered up, so maybe the problem is how I have everything set up on > dev (though I followed everything on the wiki): A couple of things to check. - make sure that you put your public key in ~/.ssh/authorized_keys2 (not authorized_keys) - make sure that authorized_keys2 is chmod'ed 600 (644 might be enough...). - make sure that ~/.ssh is chmoded 700. - make sure that your home directory is 755. Then see if it works. You might be able to relax some of those protections a bit, but ssh's uptight about letting other people mess with that data. g. From dmessina at wustl.edu Fri Jun 29 18:47:14 2007 From: dmessina at wustl.edu (David Messina) Date: Fri, 29 Jun 2007 17:47:14 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <4684AF3D.5090907@sheffield.ac.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> <18051.61992.627473.323346@almost.alerce.com> <4684AF3D.5090907@sheffield.ac.uk> Message-ID: <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu> > [Nathan] > Don't .t files need adding to the auto-props? Yes -- thanks for reminding me. Please feel free to add it to the wiki page. I'll be tweaking it some more later on in any case. Dave From n.haigh at sheffield.ac.uk Sat Jun 30 05:55:56 2007 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Sat, 30 Jun 2007 10:55:56 +0100 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> <18051.61992.627473.323346@almost.alerce.com> <4684AF3D.5090907@sheffield.ac.uk> <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu> Message-ID: <468628AC.9060200@sheffield.ac.uk> David Messina wrote: >> [Nathan] >> Don't .t files need adding to the auto-props? > > Yes -- thanks for reminding me. Please feel free to add it to the wiki > page. I'll be tweaking it some more later on in any case. > > > Dave I noticed this has already been done. I have just been through the t/data dir and added a list of extensions I found (without props). There are some files without extensions, how should these be dealt with? There seems to be a plethora of file naming styles which means there's a pretty long list of non-standard extensions. So at some point someone will commit a new data file with a new extension (often describing what program created the output or the test for which it's intended) that won't be in the auto-props file - can you think of a way around this? 
Nath From cjfields at uiuc.edu Sat Jun 30 08:48:10 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 30 Jun 2007 07:48:10 -0500 Subject: [Bioperl-l] ssh keys [was Re: First cut svn repository] In-Reply-To: <18053.23363.102371.602742@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> <42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu> <18053.23363.102371.602742@almost.alerce.com> Message-ID: <3874B4EE-0119-40BC-8B92-11133A766417@uiuc.edu> On Jun 29, 2007, at 2:19 PM, George Hartzell wrote: > Chris Fields writes: >> >> On Jun 29, 2007, at 7:28 AM, David Messina wrote: >> >>>> >>>> BTW, I haven't been able to check out the new svn repository via >>>> svn+ssh:// because I can't get svn to authenticate with an >>>> alternative >>>> username. >>> >>> I have the same issue. I set up a stanza in my ~/.ssh/config: >>> >>> Host dev.open-bio.org >>> User dave_messina >>> >>> where dave_messina is my dev.open-bio.org username. >> >> I changed to the macports ssh w/o luck. It appears the key is >> offered up, so maybe the problem is how I have everything set up on >> dev (though I followed everything on the wiki): > > A couple of things to check. > > - make sure that you put your public key in ~/.ssh/authorized_keys2 > (not authorized_keys) > > - make sure that authorized_keys2 is chmod'ed 600 (644 might be > enough...). > > - make sure that ~/.ssh is chmoded 700. > > - make sure that your home directory is 755. > > Then see if it works. You might be able to relax some of those > protections a bit, but ssh's uptight about letting other people mess > with that data. > > g. 
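The checklist quoted above can be sketched as a few chmod calls. This version runs against a scratch directory standing in for the home directory on the dev machine, so the directory name is an assumption made for illustration; the modes are the ones George lists.

```shell
#!/bin/sh
# Sketch only: apply the permissions from the checklist to a scratch
# directory standing in for the home directory on the dev machine.
demo_home="$(mktemp -d)"      # illustrative stand-in for $HOME
mkdir -p "$demo_home/.ssh"
: > "$demo_home/.ssh/authorized_keys2"   # would hold the public key

chmod 755 "$demo_home"                        # home: not group/world writable
chmod 700 "$demo_home/.ssh"                   # .ssh: owner only
chmod 600 "$demo_home/.ssh/authorized_keys2"  # key file: owner read/write

# Show the resulting modes for a quick visual check.
ls -ld "$demo_home" "$demo_home/.ssh" "$demo_home/.ssh/authorized_keys2"
```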
Got it working; it was the permissions on my home dir (the last one). Thanks George! chris From dmessina at wustl.edu Sat Jun 30 11:37:44 2007 From: dmessina at wustl.edu (David Messina) Date: Sat, 30 Jun 2007 10:37:44 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <468628AC.9060200@sheffield.ac.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> <18051.61992.627473.323346@almost.alerce.com> <4684AF3D.5090907@sheffield.ac.uk> <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu> <468628AC.9060200@sheffield.ac.uk> Message-ID: <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu> > I have just been through the t/data dir and added a list of > extensions I found Thanks! That's a big help. I'll add prop definitions to those shortly. > There are some files without extensions, how should these be dealt > with? If you look in the text files section, there are some files there which don't have extensions, e.g. AUTHORS, BUGS. There's also Makefile.* so we have some flexibility in how svn knows to auto-prop a file. I haven't read up on the details yet to find out how it handles files that match multiple criteria -- it may be dependent simply on the order they're defined. > There seems to be a plethora of file naming styles which means > there's a pretty long list of non-standard extensions. So at some > point someone will commit a new data file with a new extension > (often describing what program created the output or the test for > which it's intended) that won't be in the auto-props file - can you > think of a way around this? Ive been thinking about this a bit. How about this? 
- We have just "standard" files and extensions (like *.blast, *.fasta) in the auto-props list.

- We manually add props for the files that have nonstandard, arbitrary extensions so all the files have now are prop'd.

- At some point we rename those nonstandard files to have standard extensions. Especially for the t/data/ files, we'll have to make sure to update the tests that rely on them.

- We can have a suggested list of extensions for new files that get added. I don't think we need to strictly enforce this just for the sake of svn (after all, its primary function of version control will work just fine without any properties set), but it would be nice if we could try to keep to it mostly.

Many distros come with an /etc/mime.types file which has the list of officially registered MIME types. I found a script that will take this list and convert it into auto-props format. I don't think we need to support *all* of the gazillion filetypes, since our repository will never see most of them, but we certainly could.
Dave From dmessina at wustl.edu Sat Jun 30 12:26:27 2007 From: dmessina at wustl.edu (David Messina) Date: Sat, 30 Jun 2007 11:26:27 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> <18051.61992.627473.323346@almost.alerce.com> <4684AF3D.5090907@sheffield.ac.uk> <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu> <468628AC.9060200@sheffield.ac.uk> <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu> Message-ID: On Jun 30, 2007, at 10:37 AM, David Messina wrote: > - We manually add props for the files that have nonstandard, > arbitrary extensions so all the files have now are prop'd. Er, that should be - We manually add props for the files that have nonstandard, arbitrary extensions so that all the files now in the repository are prop'd. From n.haigh at sheffield.ac.uk Sat Jun 30 13:25:58 2007 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Sat, 30 Jun 2007 18:25:58 +0100 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> <18051.61992.627473.323346@almost.alerce.com> <4684AF3D.5090907@sheffield.ac.uk> <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu> <468628AC.9060200@sheffield.ac.uk> <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu> Message-ID: <46869226.70203@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 - -- snip -- > > >> There seems to be a plethora of file naming styles which means there's >> a pretty long list of non-standard extensions. So at some point >> someone will commit a new data file with a new extension (often >> describing what program created the output or the test for which it's >> intended) that won't be in the auto-props file - can you think of a >> way around this? > > Ive been thinking about this a bit. How about this? > > - We have just "standard" files and extensions (like *.blast, *.fasta) > in the auto-props list. I think the list of seq formats recognised by Bioperl in Bio::SeqIO and Bio::AlignIO would be a good start. As these are likely to be the ones that are sensitive to file format recognition and thus could break tests if renamed. I think a lot of people have used "." in file names as an alternative to a space. I think it would be beneficial to use an underscore "_" in these cases and leave the "." to represent the beginning of the file extension. 
> > - We manually add props for the files that have nonstandard, arbitrary > extensions so all the files that we currently have now are prop'd. > > - At some point we rename those nonstandard files to have standard > extensions. Especially for the t/data/ files, we'll have to make sure to > update the tests that rely on them. Nice and easy with svn :) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGhpHiczuW2jkwy2gRAuZ5AKCnd2MvCsvSn1NemDVMmabnieR2vACg1Qk0 pYVvXwxq0lpiGfM09RQ6A1I= =3Lhw -----END PGP SIGNATURE----- From cjfields at uiuc.edu Sat Jun 30 15:11:52 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 30 Jun 2007 14:11:52 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> <18051.61992.627473.323346@almost.alerce.com> <4684AF3D.5090907@sheffield.ac.uk> <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu> <468628AC.9060200@sheffield.ac.uk> <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu> Message-ID: On Jun 30, 2007, at 11:26 AM, David Messina wrote: > > On Jun 30, 2007, at 10:37 AM, David Messina wrote: > >> - We manually add props for the files that have nonstandard, >> arbitrary extensions so all the files have now are prop'd. > > Er, that should be > > - We manually add props for the files that have nonstandard, > arbitrary extensions so that all the files now in the repository are > prop'd. Do we need to define every filetype extension, or can there be a fallback (eg if it isn't on the list or has no extension it's plain text)? 
chris From hlapp at gmx.net Sat Jun 30 17:26:22 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Sat, 30 Jun 2007 17:26:22 -0400 Subject: [Bioperl-l] Splits again In-Reply-To: <468409C7.7020102@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> <4683DBEA.90005@sendu.me.uk> <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu> <468409C7.7020102@sendu.me.uk> Message-ID: On Jun 28, 2007, at 3:19 PM, Sendu Bala wrote: > [...] > Very definitely the latter. The key benefit of my approach is that > the organisation stays as is and that a snapshot of the repository > remains a single directory of modules in Bio so that people don't > have to 'install' Bioperl, they can still just uncompress the > archive (or check out the package from svn) and point their > PERL5LIB to the root dir of the package. I think this is absolutely key to keep in mind. Anything without this feature will likely be a non-starter. I don't really have time to follow the discussion let alone participate, so really all I can contribute is to offer some sanity/ reality checks (such as the above). In this sense, I understand a release pumpkin will generate ~900 packages to upload to CPAN? How much hassle is that compared to what uploading a bioperl release means right now? How brittle is all the Build.PL code that will be needed to automate all of this, and how difficult will it be to maintain? For example, if someone adds in 10 new modules, what Build.PL-related work will need to be done? 
-hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From bix at sendu.me.uk Sat Jun 30 17:32:52 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Sat, 30 Jun 2007 22:32:52 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> <4683DBEA.90005@sendu.me.uk> <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu> <468409C7.7020102@sendu.me.uk> Message-ID: <4686CC04.6000403@sendu.me.uk> Hilmar Lapp wrote: > On Jun 28, 2007, at 3:19 PM, Sendu Bala wrote: > >> [...] >> Very definitely the latter. The key benefit of my approach is that >> the organisation stays as is and that a snapshot of the repository >> remains a single directory of modules in Bio so that people don't >> have to 'install' Bioperl, they can still just uncompress the >> archive (or check out the package from svn) and point their >> PERL5LIB to the root dir of the package. [snip] > In this sense, I understand a release pumpkin will generate ~900 > packages to upload to CPAN? How much hassle is that compared to what > uploading a bioperl release means right now? I'd have to investigate. I did my uploads using the PAUSE website, which for 900 packages would be unfeasible. Will have to see if the process can be automated. > How brittle is all the Build.PL code that will be needed to automate > all of this, and how difficult will it be to maintain? For example, > if someone adds in 10 new modules, what Build.PL-related work will > need to be done? Well, my plan will be that once the work is done, you won't need to touch the Build.PL code again. 
My intent is that the pumpkin can just type one command and not think about anything. As for the reality, I won't know until I think about it properly and experiment. From hlapp at gmx.net Sat Jun 30 19:36:45 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Sat, 30 Jun 2007 19:36:45 -0400 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <18052.3946.224905.415905@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> <4683C385.3050904@sendu.me.uk> <18051.63674.685297.426813@almost.alerce.com> <18052.3946.224905.415905@almost.alerce.com> Message-ID: <2159ED58-E6F4-4ED8-AC23-E8BAF69FE240@gmx.net> On Jun 28, 2007, at 3:43 PM, George Hartzell wrote: > I just did the experiment, and filename-insensitivity seems to be > breaking something. > > I'm using an svn I picked up from http://www.codingmonkeys.de/mbo/. > > I reformatted a memory stick to be case sensitive and co of > > bioperl/bioperl-live/tags/release-0-9-2/t > > worked, then I made a directory in my home dir (normal mac thing) and > got the same error as above. You picked up a rename of a file from lower case extension to upper case extension. Unfortunately, there are several months between adding the upper-case and removing the lower-case version. 
We can reconstruct what happened with this using svn log on the directory (this does not require a checkout):

$ svn log --verbose svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk/t/data

Searching for HUMBETGLOA yields the following two commits that added one and removed the other:

------------------------------------------------------------------------
r2245 | jason | 2001-12-08 11:59:05 -0500 (Sat, 08 Dec 2001) | 2 lines
Changed paths:
   M /bioperl-live/trunk/t/SearchIO.t
   A /bioperl-live/trunk/t/data/HUMBETGLOA.FASTA
   A /bioperl-live/trunk/t/data/cysprot1.FASTA

added tests for FASTA

------------------------------------------------------------------------
r2877 | jason | 2002-03-11 22:39:40 -0500 (Mon, 11 Mar 2002) | 2 lines
Changed paths:
   A /bioperl-live/trunk/t/data/HUMBETGLOA.fa
   D /bioperl-live/trunk/t/data/HUMBETGLOA.fasta

renaming file to avoid clobbering on windows

Unfortunately, both files are in the tag (again, no checkout required):

$ svn list svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data | grep HUMBETGLOA | grep -i fasta
HUMBETGLOA.FASTA
HUMBETGLOA.fasta

We can remove the offending version from the repository (again, without needing a checkout):

$ svn rm svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data/HUMBETGLOA.fasta

I did this, and now the tag checks out fine on OSX. Can anyone confirm?
(BTW the ability to operate on the repository w/o needing a checkout is another advantage of svn) -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Sat Jun 30 20:40:53 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 30 Jun 2007 19:40:53 -0500 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <2159ED58-E6F4-4ED8-AC23-E8BAF69FE240@gmx.net> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> <4683C385.3050904@sendu.me.uk> <18051.63674.685297.426813@almost.alerce.com> <18052.3946.224905.415905@almost.alerce.com> <2159ED58-E6F4-4ED8-AC23-E8BAF69FE240@gmx.net> Message-ID: Checkout worked for me (Mac OS X) using both: svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/ tags/release-0-9-2/t/data svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/ tags/release-0-9-2/ so removing the offending file worked (good catch!). Haven't run a full co but probably isn't necessary. chris On Jun 30, 2007, at 6:36 PM, Hilmar Lapp wrote: > > On Jun 28, 2007, at 3:43 PM, George Hartzell wrote: > >> I just did the experiment, and filename-insensitivity seems to be >> breaking something. >> >> I'm using an svn I picked up from http://www.codingmonkeys.de/mbo/. >> >> I reformatted a memory stick to be case sensitive and co of >> >> bioperl/bioperl-live/tags/release-0-9-2/t >> >> worked, then I made a directory in my home dir (normal mac thing) and >> got the same error as above. > > You picked up a rename of a file from lower case extension to upper > case extension. 
Unfortunately, there are several months between > adding the upper-case and removing the lower-case version. > > We can reconstruct what happened with this using svn log on the > directory (this does not require a checkout): > > $ svn log --verbose svn+ssh://dev.open-bio.org/home/hartzell/ > bioperl/bioperl-live/trunk/t/data > > Searching for HUMBETGLOA yields the following two commits that > added one and removed the other: > > ---------------------------------------------------------------------- > -- > r2245 | jason | 2001-12-08 11:59:05 -0500 (Sat, 08 Dec 2001) | 2 lines > Changed paths: > M /bioperl-live/trunk/t/SearchIO.t > A /bioperl-live/trunk/t/data/HUMBETGLOA.FASTA > A /bioperl-live/trunk/t/data/cysprot1.FASTA > > added tests for FASTA > > ---------------------------------------------------------------------- > -- > r2877 | jason | 2002-03-11 22:39:40 -0500 (Mon, 11 Mar 2002) | 2 lines > Changed paths: > A /bioperl-live/trunk/t/data/HUMBETGLOA.fa > D /bioperl-live/trunk/t/data/HUMBETGLOA.fasta > > renaming file to avoid clobbering on windows > > Unfortunately, both files are in the tag (again, no checkout > required): > > $ svn list svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- > live/tags/release-0-9-2/t/data | grep HUMBETGLOA | grep -i fasta > HUMBETGLOA.FASTA > HUMBETGLOA.fasta > > We can remove the offending version from the repository (again, > without needing a checkout): > > $ svn rm svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- > live/tags/release-0-9-2/t/data/HUMBETGLOA.fasta > > I did this, and now the tag checks out fine on OSX. Can anyone > confirm? > > (BTW the ability to operate on the repository w/o needing a > checkout is another advantage of svn) > > -hilmar > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From hartzell at alerce.com Sat Jun 30 20:48:06 2007
From: hartzell at alerce.com (George Hartzell)
Date: Sat, 30 Jun 2007 17:48:06 -0700
Subject: [Bioperl-l] Take 2 of the new subversion repository.
Message-ID: <18054.63942.316904.413911@almost.alerce.com>

There's a second cut at the subversion repository. I've done a better
job of setting svn:keywords and svn:eol-style on various files. The
defaults were more cautious and I used an auto-props file based on
the wiki version.

  svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take2

The old repository's still around as

  svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take1

I renamed it so that people wouldn't work with it by mistake. If, for
some hard-to-imagine reason, you have a working copy that you want to
run against it, you should be able to do an svn switch --relocate on
your working copy and be back in shape. In fact, it might be a good
time to give it a try....

g.
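
[Editorial sketch, not part of the original posts.] The checkout failure discussed earlier in this thread came from two names in one tag directory (HUMBETGLOA.FASTA and HUMBETGLOA.fasta) that differ only by case. Before the final cut of the repository it may be useful to look for such pairs mechanically. The following is a minimal sketch under stated assumptions: the function name find_case_collisions is hypothetical, nothing here is an svn feature, and it needs only a POSIX shell with awk and sort. It reads one file name per line on stdin and prints each group of names that would collide on a case-insensitive filesystem.

```shell
#!/bin/sh
# find_case_collisions: hypothetical helper, not part of svn or BioPerl.
# Reads file names (one per line) on stdin and prints, one group per line,
# the names that differ only by upper/lower case.
find_case_collisions() {
    awk '
        {
            key = tolower($0)                    # case-folded name is the grouping key
            count[key]++
            if (count[key] == 1)
                names[key] = $0                  # first spelling seen for this key
            else
                names[key] = names[key] " " $0   # append later spellings
        }
        END {
            for (key in count)
                if (count[key] > 1)
                    print names[key]             # report only real collisions
        }
    ' | sort
}
```

One way to use it would be to pipe the output of the svn list command shown earlier in the thread (run against a tag's t/data directory) into find_case_collisions; an empty result suggests the directory should check out cleanly on OS X or Windows. Names containing spaces would make the space-separated output ambiguous, so treat this as a quick check rather than a robust tool.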
From hartzell at alerce.com Sat Jun 30 21:17:18 2007 From: hartzell at alerce.com (George Hartzell) Date: Sat, 30 Jun 2007 18:17:18 -0700 Subject: [Bioperl-l] First cut svn repository In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> <4683C385.3050904@sendu.me.uk> <18051.63674.685297.426813@almost.alerce.com> <18052.3946.224905.415905@almost.alerce.com> <2159ED58-E6F4-4ED8-AC23-E8BAF69FE240@gmx.net> Message-ID: <18055.158.30409.808612@almost.alerce.com> Chris Fields writes: > Checkout worked for me (Mac OS X) using both: > > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/ > tags/release-0-9-2/t/data > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/ > tags/release-0-9-2/ > > so removing the offending file worked (good catch!). Haven't run a > full co but probably isn't necessary. > [...] I'll keep a note of that as something to do when I prepare the final cut of the repository. g. From jason at bioperl.org Sat Jun 30 21:25:30 2007 From: jason at bioperl.org (Jason Stajich) Date: Sat, 30 Jun 2007 18:25:30 -0700 Subject: [Bioperl-l] Take 2 of the new subversion repository. In-Reply-To: <18054.63942.316904.413911@almost.alerce.com> References: <18054.63942.316904.413911@almost.alerce.com> Message-ID: Thanks George - I also did chgrp -R bioperl /home/hartzell/bioperl_take? to make sure the group permission was set right. We may also want to do a chmod g+s on all the dirs in there as well so that permissions are preserved when this gets deployed for real. 
If anyone wants to make some changes to files and commit them, as
well as make some branches/tags to play around a little bit, please
do, since we'll likely throw this away and do it again from a
locked-down version of CVS at some appointed time.

Do you know how to have svn commit messages generate summary emails
as well?

-j
On Jun 30, 2007, at 5:48 PM, George Hartzell wrote:

>
> There's a second cut at the subversion repository. I've done a better
> job of setting svn:keywords and svn:eol-style on various files. The
> defaults were more cautious and I used an auto-props file based on
> the wiki version.
>
>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take2
>
> The old repository's still around as
>
>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take1
>
> I renamed it so that people wouldn't work with it by mistake. If, for
> some hard-to-imagine reason, you have a working copy that you want to
> run against it, you should be able to do an svn switch --relocate on
> your working copy and be back in shape. In fact, it might be a good
> time to give it a try....
>
> g.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/

From hlapp at gmx.net Sat Jun 30 22:21:25 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 30 Jun 2007 22:21:25 -0400
Subject: [Bioperl-l] Take 2 of the new subversion repository.
In-Reply-To: <18054.63942.316904.413911@almost.alerce.com>
References: <18054.63942.316904.413911@almost.alerce.com>
Message-ID: <5F53A433-BAA9-431D-A0C5-5955690D0B73@gmx.net>

On Jun 30, 2007, at 8:48 PM, George Hartzell wrote:

> I renamed it so that people wouldn't work with it by mistake. If, for
> some hard-to-imagine reason, you have a working copy that you want to
> run against it,

It's not so hard to imagine - checking out the entire repository
takes a long time.
> you should be able to do an svn switch --relocate on > your working copy and be back in shape. In fact, it might be a good > time to give it a try.... It doesn't work: svn: The repository at 'svn+ssh://dev.open-bio.org/home/hartzell/ bioperl_take2' has uuid '31277767-6726-dc11-ab4c-0019e3f901d6', but the WC has '27e854f1-f323-dc11-8c1b-0019e3f901d6' You can't relocate to a totally new repository (relocating to bioperl_take1 does work though). -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Sat Jun 30 22:39:27 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 30 Jun 2007 21:39:27 -0500 Subject: [Bioperl-l] Take 2 of the new subversion repository. In-Reply-To: References: <18054.63942.316904.413911@almost.alerce.com> Message-ID: <7C6FD6C9-CBED-40D3-BA90-4B34F79E6DE0@uiuc.edu> There are a few CPAN modules available; here's one: http://search.cpan.org/~dwheeler/SVN-Notify-2.66/lib/SVN/Notify.pm chris On Jun 30, 2007, at 8:25 PM, Jason Stajich wrote: > Thanks George - > I also did > chgrp -R bioperl /home/hartzell/bioperl_take? > to make sure the group permission was set right. > > We may also want to do a chmod g+s on all the dirs in there as well > so that permissions are preserved when this gets deployed for real. > > If anyone wants to make some changes to files and commit them, as > well as make some branches/tags to play around a little bit since > we'll likely throw this away and do it again from locked down version > from CVS at some appointed time. > > Do you know how to have svn commit messages generate summary emails > as well? > > -j > On Jun 30, 2007, at 5:48 PM, George Hartzell wrote: > >> >> There's a second cut at the subversion repository. I've done a >> better >> job of setting svn:keywords and svn:eol-style on various files. 
The >> defaults were more cautious and I used an auto-props files based on >> the wiki version. >> >> svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take2 >> >> The old repository's still around as >> >> svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take1 >> >> I renamed it so that people would work with it by mistake. If, for >> some hard-to-imagine reason, you have a working copy that you want to >> run against it, you should be able to do an svn switch --relocate on >> your working copy and be back in shape. In fact, it might be a good >> time to give it a try.... >> >> g. >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > Jason Stajich > jason at bioperl.org > http://jason.open-bio.org/ > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Sat Jun 30 22:46:05 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 30 Jun 2007 21:46:05 -0500 Subject: [Bioperl-l] Splits again In-Reply-To: <4686CC04.6000403@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> <4683DBEA.90005@sendu.me.uk> <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu> <468409C7.7020102@sendu.me.uk> <4686CC04.6000403@sendu.me.uk> Message-ID: On Jun 30, 2007, at 4:32 PM, Sendu Bala wrote: > Hilmar Lapp wrote: >> On Jun 28, 2007, at 3:19 PM, Sendu Bala wrote: >>> [...] >>> Very definitely the latter. 
The key benefit of my approach is >>> that the organisation stays as is and that a snapshot of the >>> repository remains a single directory of modules in Bio so that >>> people don't have to 'install' Bioperl, they can still just >>> uncompress the archive (or check out the package from svn) and >>> point their PERL5LIB to the root dir of the package. > [snip] >> In this sense, I understand a release pumpkin will generate ~900 >> packages to upload to CPAN? How much hassle is that compared to >> what uploading a bioperl release means right now? > > I'd have to investigate. I did my uploads using the PAUSE website, > which for 900 packages would be unfeasible. Will have to see if the > process can be automated. Not that they would care one way or another but maybe we should contact the CPAN maintainers to get their thoughts. They might have some ideas... >> How brittle is all the Build.PL code that will be needed to >> automate all of this, and how difficult will it be to maintain? >> For example, if someone adds in 10 new modules, what Build.PL- >> related work will need to be done? > > Well, my plan will be that once the work is done, you won't need to > touch the Build.PL code again. My intent is that the pumpkin can > just type one command and not think about anything. > > As for the reality, I won't know until I think about it properly > and experiment. A good experiment for a branch. I still think this could be accomplished step-wise; for instance run a quick test using something with a simple dependency tree like Bio::Root::Root (only needs RootI), finish up with Bio::Root*, then work down into PrimarySeq, Seq, etc. Submit them to CPAN piecemeal or in batches (all Bio::Seq*, so on). If the Build.PL, etc are to be generated on the fly then maybe there should be a simple way of registering or matching tests to modules (or vice versa) to ease the pain, particularly for new code. 
chris From hlapp at gmx.net Sat Jun 30 22:56:04 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Sat, 30 Jun 2007 22:56:04 -0400 Subject: [Bioperl-l] First cut svn repository In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> <4683C385.3050904@sendu.me.uk> <18051.63674.685297.426813@almost.alerce.com> <18052.3946.224905.415905@almost.alerce.com> <2159ED58-E6F4-4ED8-AC23-E8BAF69FE240@gmx.net> Message-ID: It turns out that both files are also present on the release-0-9-3, bioperl-1-0-0, bioperl-1-0-alpha, and bioperl-1-0-alpha2-rc tags, so add $ svn rm -m "Removing offending duplicate" svn+ssh://dev.open-bio.org/ home/hartzell/bioperl/bioperl-live/tags/release-0-9-3/t/data/ HUMBETGLOA.fasta $ svn rm -m "Removing offending duplicate" svn+ssh://dev.open-bio.org/ home/hartzell/bioperl/bioperl-live/tags/bioperl-1-0-0/t/data/ HUMBETGLOA.fasta $ svn rm -m "Removing offending duplicate" svn+ssh://dev.open-bio.org/ home/hartzell/bioperl/bioperl-live/tags/bioperl-1-0-alpha/t/data/ HUMBETGLOA.fasta $ svn rm -m "Removing offending duplicate" svn+ssh://dev.open-bio.org/ home/hartzell/bioperl/bioperl-live/tags/bioperl-1-0-alpha2-rc/t/data/ HUMBETGLOA.fasta to the post-processing commands. -hilmar On Jun 30, 2007, at 8:40 PM, Chris Fields wrote: > Checkout worked for me (Mac OS X) using both: > > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- > live/tags/release-0-9-2/t/data > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- > live/tags/release-0-9-2/ > > so removing the offending file worked (good catch!). Haven't run a > full co but probably isn't necessary. 
> > chris > > On Jun 30, 2007, at 6:36 PM, Hilmar Lapp wrote: > >> >> On Jun 28, 2007, at 3:43 PM, George Hartzell wrote: >> >>> I just did the experiment, and filename-insensitivity seems to be >>> breaking something. >>> >>> I'm using an svn I picked up from http://www.codingmonkeys.de/mbo/. >>> >>> I reformatted a memory stick to be case sensitive and co of >>> >>> bioperl/bioperl-live/tags/release-0-9-2/t >>> >>> worked, then I made a directory in my home dir (normal mac thing) >>> and >>> got the same error as above. >> >> You picked up a rename of a file from lower case extension to >> upper case extension. Unfortunately, there are several months >> between adding the upper-case and removing the lower-case version. >> >> We can reconstruct what happened with this using svn log on the >> directory (this does not require a checkout): >> >> $ svn log --verbose svn+ssh://dev.open-bio.org/home/hartzell/ >> bioperl/bioperl-live/trunk/t/data >> >> Searching for HUMBETGLOA yields the following two commits that >> added one and removed the other: >> >> --------------------------------------------------------------------- >> --- >> r2245 | jason | 2001-12-08 11:59:05 -0500 (Sat, 08 Dec 2001) | 2 >> lines >> Changed paths: >> M /bioperl-live/trunk/t/SearchIO.t >> A /bioperl-live/trunk/t/data/HUMBETGLOA.FASTA >> A /bioperl-live/trunk/t/data/cysprot1.FASTA >> >> added tests for FASTA >> >> --------------------------------------------------------------------- >> --- >> r2877 | jason | 2002-03-11 22:39:40 -0500 (Mon, 11 Mar 2002) | 2 >> lines >> Changed paths: >> A /bioperl-live/trunk/t/data/HUMBETGLOA.fa >> D /bioperl-live/trunk/t/data/HUMBETGLOA.fasta >> >> renaming file to avoid clobbering on windows >> >> Unfortunately, both files are in the tag (again, no checkout >> required): >> >> $ svn list svn+ssh://dev.open-bio.org/home/hartzell/bioperl/ >> bioperl-live/tags/release-0-9-2/t/data | grep HUMBETGLOA | grep -i >> fasta >> HUMBETGLOA.FASTA >> HUMBETGLOA.fasta >> >> We 
can remove the offending version from the repository (again, >> without needing a checkout): >> >> $ svn rm svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- >> live/tags/release-0-9-2/t/data/HUMBETGLOA.fasta >> >> I did this, and now the tag checks out fine on OSX. Can anyone >> confirm? >> >> (BTW the ability to operate on the repository w/o needing a >> checkout is another advantage of svn) >> >> -hilmar >> -- >> =========================================================== >> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >> =========================================================== >> >> >> >> >> > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From bix at sendu.me.uk Fri Jun 1 04:06:04 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 01 Jun 2007 09:06:04 +0100 Subject: [Bioperl-l] ClustalW Score? In-Reply-To: <1A4207F8295607498283FE9E93B775B40334A01A@EX02.asurite.ad.asu.edu> References: <00e201c7a2de$91f60f50$2d01a8c0@PICO><465E9B58.1020403@sendu.me.uk> <49B6333A-18B9-4B63-80EF-81C57A295494@bioperl.org> <1A4207F8295607498283FE9E93B775B40334A01A@EX02.asurite.ad.asu.edu> Message-ID: <465FD36C.5060603@sendu.me.uk> Kevin Brown wrote: >> you're right --- it is not really my code, I was just >> elaborating Kevin's example --- it would probably need to be >> more specific or perhaps the last Score seen is sufficient >> for what one is trying to capture? > > I took that code from a pairwise clustal alignment script that I wrote > to deal with aligning a bunch of short sequences against a long one to > see where they line up at. When all of them were fed to Clustal the > short sequences all ended up aligned to each other and not well aligned > to the longer sequence. 
I only saw one score in the output from the
> pairwise, so that is what I used to find a reasonable value.

Ok, well I've hedged my bets and used both. Now committed to CVS.

From jy at genseq.co.uk Fri Jun 1 22:39:48 2007
From: jy at genseq.co.uk (Jean-Yves Sireau)
Date: Sat, 2 Jun 2007 10:39:48 +0800
Subject: [Bioperl-l] Genseq
Message-ID: <20070602103948.093d713c@jys.my.regentmarkets.com>

Dear List members,

I would like to let you know of the formation of Genseq Ltd., a
bioinformatics company that will (in time!) offer genome sequencing to
high net worth individuals and bioinformatic analysis of the sequence
data to detect predisposition to illness. The company's website is
www.genseq.co.uk

Genseq would be willing to sponsor bioperl, whether financially or by
providing resources, notably for any bioperl-related activities in the
Asia Pacific region.

Genseq's bioinformatics team will be based in Cyberjaya (Malaysia), and
we are particularly interested in promoting bioperl in Malaysia. We are
also actively recruiting at the moment in Malaysia and India.

If there were sufficient demand, we would be willing to organise a
bioperl conference in Cyberjaya at the Cyberview Lodge
(www.cyberview-lodge.com), which would be the ideal place for such a
conference in Malaysia.

Looking forward to your comments, suggestions and proposals.

Best regards
Jean-Yves Sireau

--
Jean-Yves Sireau
CEO, Genseq Ltd.
www.genseq.co.uk

From cjfields at uiuc.edu Sat Jun 2 01:16:05 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 2 Jun 2007 00:16:05 -0500
Subject: [Bioperl-l] EUtilities overhaul started
Message-ID:

To anyone using Bio::DB::EUtilities,

I am in the midst of a major overhaul to the various EUtilities tools
and to Bio::DB::GenericWebDBI (the latter of which I am forming into
more or less a test bed for other database interfaces). I'm about 80%
done at this point, and will likely start committing changes this
coming week.

The overall interface will change (something I had warned about in
the Bio::DB::EUtilities POD) but I am hoping it will be more
intuitive and easier to use in the long run. I'll describe the
overall redesign and use in an upcoming HOWTO (as recommended by
Brian a while back).

If anyone has any suggestions/ideas/flames, please let me know!

Cheers!

chris

From cjfields at uiuc.edu Sat Jun 2 10:39:25 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 2 Jun 2007 09:39:25 -0500
Subject: [Bioperl-l] EUtilities overhaul started
In-Reply-To:
References:
Message-ID:

Yes, there are a few odd issues, though that's one I've not heard of
yet. You might try one of the sub-nucleotide databases (nuccore,
nucest, nucgss).

I'll try looking into it and (if necessary) pester NCBI about it.
I'll pass this on to the mail list to see if anyone else knows about
the problem.

chris

On Jun 2, 2007, at 8:28 AM, Bernd Brandt wrote:

> Hi Chris,
>
> Thanks for your work on EUtilities.
> For a production task, I used EUtilities directly (given your
> announced overhaul). I noticed a recent problem at NCBI (reported two
> weeks ago to NCBI, no reply yet). Possibly you may run into this with
> testing: if you ePOST gi ids to the EU server and then use this set in
> ESearch (using the query key) no results are returned for the
> nucleotide database.
> ESearches like "db=$db%23$QueryKey" typically fail if the $db is
> nucleotide (but work if $db='protein'). The XML output has Count 0 and
> an empty QueryTranslationSet for db=nucleotide only.
> For completeness, I attach a simple test script I used.
>
>
> Best regards,
> Bernd
>
>
> On 6/2/07, Chris Fields wrote:
>> To anyone using Bio::DB::EUtilities,
>>
>> I am in the midst of a major overhaul to the various EUtilities tools
>> and to Bio::DB::GenericWebDBI (the latter of which I am forming into
>> more or less a test bed for other database interfaces).
I'm about >> 80% done at this point, and will likely start committing changes this >> coming week. >> >> The overall interface will change (something I had warned about in >> the Bio::DB::EUtilities POD) but I am hoping it will be more >> intuitive and easier to use in the long run. I'll describe the >> overall redesign and use in an upcoming HOWTO (as recommended by >> Brian a while back). >> >> If anyone has any suggestions/ideas/flames, please let me know! >> >> Cheers! >> >> chris >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Sun Jun 3 00:51:57 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 2 Jun 2007 23:51:57 -0500 Subject: [Bioperl-l] EUtilities overhaul started In-Reply-To: References: Message-ID: <1A2AF5C4-6A58-4FDD-A4CA-6ABCE30F0D1B@uiuc.edu> I can confirm this; however it only relates to the use of history with esearch and nucleotide (use of the history with other eutils seems to work fine); retrieving sequences via efetch is not affected. If I find out anything more I'll post something on the mail list. chris On Jun 2, 2007, at 11:48 AM, Bernd Brandt wrote: > I can confirm that using the correct sub-nucleotide database works > (nuccore in my case). > This seems to be a quite recent change/bug at NCBI. Until recently, > db=nucleotide worked. Moreover, EInfo still lists nucleotide as valid > db. > It is not optimal to have to choose the sub-database and the searches > work via the Entrez web-interface. Note that this problem is related > to the ESearch and db=nucleotide. > > bernd > > On 6/2/07, Chris Fields wrote: >> Yes, there are a few odd issues, though that's one I've not heard of >> yet. 
You might try one of the sub-nucleotide databases (nuccore, >> nucest, nucgss). >> >> I'll try looking into it and (if necessary) pester NCBI about it. >> I'll pass this on to the mail list to see if anyone else knows about >> the problem. >> >> chris >> >> On Jun 2, 2007, at 8:28 AM, Bernd Brandt wrote: >> >> > Hi Chris, >> > >> > Thanks for your work on EUtilities. >> > For a production task, I used EUtilitities directly (given your >> > announced overhaul). I noticed a recent problem at NCBI >> (reported two >> > weeks ago to NCBI, no reply yet). Possibly you may run into this >> with >> > testing: if you ePOST gi ids to the EU server and then use this >> set in >> > Esearch (using the query key) no results are returned for the >> > nucleotide database. >> > ESearches like "db=$db%23$QueryKey" typically fail if the $db is >> > nucleotide (but work f $db='protein'). The XML output has Count >> 0 and >> > an empty QueryTranslationSet for db=nucleotide only. >> > For completeness, I attach a simple test script I used. >> > >> > >> > Best regards, >> > Bernd >> > >> > >> > On 6/2/07, Chris Fields wrote: >> >> To anyone using Bio::DB::EUilities, >> >> >> >> I am in the midst of a major overhaul to the various EUtilities >> tools >> >> and to Bio::DB::GenericWebDBI (the latter which I am forming into >> >> more or less a test bed for other database interfaces). I'm about >> >> 80% done at this point, and will likely start committing >> changes this >> >> coming week. >> >> >> >> The overall interface will change (something I had warned about in >> >> the Bio::DB::EUtilities POD) but I am hoping it will be more >> >> intuitive and easier to use in the long run. I'll describe the >> >> overall redesign and use in an upcoming HOWTO (as recommended by >> >> Brian a while back). >> >> >> >> If anyone has any suggestions/ideas/flames, please let me know! >> >> >> >> Cheers! 
>> >>
>> >> chris
>> >> _______________________________________________
>> >> Bioperl-l mailing list
>> >> Bioperl-l at lists.open-bio.org
>> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> >>
>> >>
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>>
>>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From basu at pharm.stonybrook.edu Sun Jun 3 10:44:18 2007
From: basu at pharm.stonybrook.edu (Siddhartha Basu)
Date: Sun, 03 Jun 2007 10:44:18 -0400
Subject: [Bioperl-l] EUtilities overhaul started
In-Reply-To:
References:
Message-ID:

On Sat, 2 Jun 2007 00:16:05 -0500 Chris Fields wrote:
> To anyone using Bio::DB::EUtilities,
>
> I am in the midst of a major overhaul to the various EUtilities tools
> and to Bio::DB::GenericWebDBI (the latter which I am forming into
> more or less a test bed for other database interfaces). I'm about
> 80% done at this point, and will likely start committing changes this
> coming week.
>
> The overall interface will change (something I had warned about in
> the Bio::DB::EUtilities POD) but I am hoping it will be more
> intuitive and easier to use in the long run. I'll describe the
> overall redesign and use in an upcoming HOWTO (as recommended by
> Brian a while back).

Hi chris,

As a frequent user of EUtilities, I hope this API facelift and the
upcoming HOWTO will be helpful.
Anyway, one thing I noticed is that for each eutil call such as
efetch, epost, esearch, esummary a new 'Bio::DB::EUtilities' object has
to be instantiated. And thereafter its parameters cannot be set at
runtime, such as with $eutils->id('ids'), for example....

my $eutils = Bio::DB::EUtilities->new ( -id    => $id,
                                        -eutil => 'esummary',
                                        -db    => 'protein',
                                      );
my $ct = $eutils->get_response->content();

## -- now i cannot do this...
$eutils->id($newid);
$ct = $eutils->get_response->content();

Is the new API going to address something along these lines, or is
there currently any way to reuse the object?
Thanks again for this nice toolkit.

-siddhartha

>
> If anyone has any suggestions/ideas/flames, please let me know!
>
> Cheers!
>
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From cjfields at uiuc.edu Sun Jun 3 19:52:39 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 3 Jun 2007 18:52:39 -0500
Subject: [Bioperl-l] EUtilities overhaul started
In-Reply-To:
References:
Message-ID: <5120BD7B-CA89-46E4-8D6B-6B24C1F93A5E@uiuc.edu>

On Jun 3, 2007, at 9:44 AM, Siddhartha Basu wrote:

> ...
> Hi chris,
> As a frequent user of EUtilities, I hope this API facelift and the
> upcoming HOWTO will be helpful.
> Anyway, one thing I noticed is that for each eutil call such as
> efetch, epost, esearch, esummary a new 'Bio::DB::EUtilities' object has
> to be instantiated. And thereafter its parameters cannot be set at
> runtime, such as with $eutils->id('ids'), for example....
>
> my $eutils = Bio::DB::EUtilities->new ( -id    => $id,
>                                         -eutil => 'esummary',
>                                         -db    => 'protein',
>                                       );
> my $ct = $eutils->get_response->content();
>
> ## -- now i cannot do this...
> $eutils->id($newid);
> $ct = $eutils->get_response->content();

I'll have to check up on that, though changing id() should work with
the old API. It won't matter with the new API (it works fine), but it
is still troubling...

> Is the new API going to address something along these lines, or is
> there currently any way to reuse the object?
> Thanks again for this nice toolkit.
>
> -siddhartha

The old API was based upon the idea of creating discrete user agents
for each eutil to retrieve data.
The problem with the old interface is that it attempts to do too much
(take care of parameters, set up requests, retrieve responses, parse
data, etc), and many tasks required instantiating a new EUtilities
object. I was never really satisfied with it.

The new interface is a composition of three classes: the web user
agent (LWP::UserAgent), a class encapsulating parameter handling, and
a parser class (all of which can be used independently if needed).
When parameters change a new request is made 'lazily' (i.e. only when
needed). Similarly, when data is requested after any parameter change
a new parser instance is created and the new response is parsed.

With that in mind you can now do the following:

----------------------------------------

my @params = (-eutil  => 'esearch',
              -db     => 'protein',
              -term   => 'BRCA1',
              -retmax => 100);

my $eutil = Bio::DB::EUtilities->new(@params);

# no need to get response first; get_ids() calls that if needed
my @ids = $eutil->get_ids;

# below changes only those parameters, leaves all others set as before
$eutil->set_parameters(-eutil   => 'efetch',
                       -id      => \@ids,
                       -retmode => 'text',
                       -rettype => 'fasta');

# sends streamed content directly to a file
$eutil->get_response(-content_file => 'seqs.fas');

# or to a LWP::UserAgent-supported request callback
$eutil->get_response(-content_cb => \&my_cb);

my @newparams = (-eutil  => 'esearch',
                 -db     => 'protein',
                 -term   => 'BRCA2',
                 -retmax => 100);

# Resets eutility to passed parameters (or undef)
$eutil->reset_parameters(@newparams);

# retrieve new IDs
my @new_ids = $eutil->get_ids;

----------------------------------------

Note the same eutil object is used for all of the above, so to answer
your last question, yes, you should be able to create data pipelines
using the same object if necessary.
chris From sac at bioperl.org Mon Jun 4 13:56:57 2007 From: sac at bioperl.org (Steve Chervitz) Date: Mon, 4 Jun 2007 10:56:57 -0700 Subject: [Bioperl-l] question about Bio::Restriction::Analysis In-Reply-To: <3E4CBE0B-6EE4-4973-80DF-90C7E778DA83@cshl.edu> References: <3E4CBE0B-6EE4-4973-80DF-90C7E778DA83@cshl.edu> Message-ID: <8f200b4c0706041056o4dbaadfexddf9f82fc33c6da@mail.gmail.com> Hi Apurva, I'm cc:ing the list to let others know you have found performance issues with Bio::Restriction::Analysis. Ideally, we should focus on addressing those issues rather than fixing a module that is now deprecated. But taking a quick look at my Bio::Tools::RestrictionEnzyme module, I'm not sure why HpaII would give slower performance relative to other non-ambiguous cutters. This enzyme has a 4-base recognition sequence CCGG, and if you're feeding it a large CG-rich input sequence, that could be a factor. To test, you might try using some other 4-base cutters that aren't CG-rich (TaqI, TasI) or try some other input sequences. There is no special flag to indicate that the enzyme is non-ambiguous. The module handles that automatically. Good luck, Steve On 6/4/07, Apurva Narechania wrote: > Hi Rob and Steve, > > I was hoping you could answer a quick performance question regarding > the Bio::Restriction::Analysis module. I have found that though this > module works well, it is considerably slower than the deprecated > Bio::Tools::RestrictionEnzyme. I see that there are two algorithms > available to your module, and since I am using HpaII, a non-ambiguous > enzyme, I thought I might find similar performance to the older, > deprecated module, but I do not. Is it possible that I am not setting > the non-ambiguous flag correctly? Does it need to be set in the first > place? 
>
> As far as Bio::Tools::RestrictionEnzyme, though it is faster, I have
> found instances where it is inaccurate, especially in calculating
> fragments of extremely small size 1-5 base pairs, so I would like to
> use your module if possible. It just seems slow to me.
>
> Can you clarify?
>
> I have copied my code below since it is a short, simple script.
>
> Thanks!
> Apurva Narechania
> Ware Lab
> Cold Spring Harbor Labs
>
> ----------
>
> #!/usr/bin/perl
>
> # This program generates a fasta of restriction frags given an
> # input fasta and a restriction cut site
>
> use Getopt::Std;
> use Bio::Seq;
> use Bio::SeqIO;
> use strict;
>
> use Bio::Tools::RestrictionEnzyme;
>
> my %opts = ();
> getopts ('f:', \%opts);
> my $fasta = $opts{'f'};
>
> # read fasta file
> my $seqin = Bio::SeqIO -> new (-format => 'Fasta', -file => "$fasta");
>
> my $x = 0;
> while (my $sequence_obj = $seqin -> next_seq()){
>     $x++;
>     my $id = $sequence_obj->id();
>
>     print STDERR "$x Working on $id\n";
>
>     # generate the rx object
>     my $ra = new Bio::Tools::RestrictionEnzyme(-NAME=>'HpaII');
>
>     my @frags = $ra->cut_seq($sequence_obj);
>
>     my $counter = 0;
>     foreach my $frag (@frags){
>         $counter++;
>         my $length = length ($frag);
>         print ">$id.$counter length=$length\n$frag\n";
>     }
>
> }
>
>

From anhthu.tieu at gsf.de Tue Jun 5 04:14:09 2007
From: anhthu.tieu at gsf.de (Tieu, Anh-Thu)
Date: Tue, 5 Jun 2007 10:14:09 +0200
Subject: [Bioperl-l] problems with image maps and IE 6 or higher
Message-ID: <93739F94E0F3BA43AD72423E2482341A1435F6@sw-rz010.gsf.de>

Hi,

I have a problem using the bioperl image map function with IE6 or
higher. It might be a more general problem with IE6 rather than with
bioperl, but as I used bioperl to create my image maps, I thought I
could still post this problem here and ask for people's opinion. I
wonder whether anyone else has faced the same problem and, if so,
could share their experiences and solutions.

scale alignment5 integration_pt gene intron1 usemap="mapnameD064C01" style="border:2px solid #CCCCCC;"/>

> > onclick="javascript:void(zmenu( 'scale' ));;return false;" title="scale " > alt="scale " target="_blank"/> > onclick="javascript:void(zmenu( 'alignment 5splk', '', 'seq_id: ', '', > 'start: ', '', 'stop: 0', '', 'length: bp', '', 'identity: ', '', 'e-v > alue: ' ));;return false;" title="alignment5 " alt="alignment5 " > target="_blank"/> > onclick="javascript:void(zmenu( 'alignment 5splk', '', 'seq_id: ', '', > 'start: ', '', 'stop: 0', '', 'length: bp', '', 'identity: ', '', 'e-v > alue: ' ));;return false;" title="integration_pt " alt="integration_pt " > target="_blank"/> > onclick="javascript:void(zmenu( 'Nphs1 ', > '', 'ensembl_id: ENSMUSG00000006649', '', 'start: 30168485', '', ' > stop: 30195968', '', 'length: 27483 bp' ));;return false;" title="gene " > alt="gene " target="_blank"/> > onclick="javascript:void(zmenu( 'exon1', '', 'start: 30168485', '', 'stop: > 30169003', '', 'length: 518 bp' ));;return false;" title="exon1 " a > lt="exon1 " target="_blank"/> > onclick="javascript:void(zmenu( 'intron1', '', 'start: 30169004', '', 'stop: > 30169083', '', 'length: 79 bp ' ));;return false;" title="intron1 > " alt="intron1 " target="_blank"/> > onclick="javascript:void(zmenu( 'exon2', '', 'start: 30169084', '', 'stop: > 30169299', '', 'length: 215 bp' ));;return false;" title="exon2 " a > lt="exon2 " target="_blank"/> > onclick="javascript:void(zmenu( 'intron2', '', 'start: 30169300', '', 'stop: > 30169373', '', 'length: 73 bp ' ));;return false;" title="intron2 > .. >
> > > This is part of the code I used in my HTML file to display the image map > and it really runs beautifully > with Mozilla 1.7 or the latest Firefox version. However, if used in IE6 > the clickable pop-ups do not appear/ work. > > I appreciate any help and would like to thank everyone for their help. > > Best regards, > > > Anh-Thu > ________________________________________________________________________ > GSF-Forschungszentrum > > Ingolst?dter Landstr. 1 > > 85764 M?nchen-Neuherberg, Germany > > Chairman of Supervisory Board: MinDir Dr. Peter Lange > > Board of Directors: Prof. Dr. G?nther Wess and Dr. Nikolaus Blum > > Register of Societies: Amtsgericht M?nchen HRB 6466 > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From cjfields at uiuc.edu Tue Jun 5 11:28:24 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 5 Jun 2007 10:28:24 -0500 Subject: [Bioperl-l] [BioPython] Cannot parse GenBank file In-Reply-To: <46656D64.7010508@ribosome.natur.cuni.cz> References: <46655550.70400@ribosome.natur.cuni.cz> <46656D64.7010508@ribosome.natur.cuni.cz> Message-ID: <24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu> Martin, The example file you give in the bioperl bugzilla report has several blank annotation lines which may lead to additional problems. When the BioPerl SeqIO parser finds annotation fields (SOURCE, ORGANISM, DEFINITION, etc) then it expects there will also be relevant data (text descriptions) accompanying it; I assume the BioPython parser expects likewise though I may be wrong. AFAIK the inclusion of field names w/o text isn't GenBank/EMBL- compliant. 
GenBank records lacking text either have a '.' instead or are left out entirely: http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html We could add a fix but you should probably contact the ApE developers and request that field names w/o text be left out or have '.' added. chris On Jun 5, 2007, at 9:04 AM, Martin MOKREJŠ wrote: > Ezequiel Panepucci wrote: >>> genbank entry = parser.parse(fhandle) >> >> there is a space character between "genbank" and "entry". >> It is a syntax error. >> I suppose you meant "genbank_entry" ? > > Yes, the next command was right and has shown the error. Sorry, I > forgot > to delete the first attempt. ;-) > >>>> genbank_entry = parser.parse(fhandle) > Traceback (most recent call last): > File "<stdin>", line 1, in ? > File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py", > line 187, in parse > self._scanner.feed(handle, self._consumer) > File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", > line 360, in feed > self._feed_first_line(consumer, self.line) > File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", > line 835, in _feed_first_line > assert False, \ > AssertionError: Did not recognise the LOCUS line layout: > LOCUS 6499 bp ds-DNA linear 02-AUG-2006 > >>>> > > Martin > _______________________________________________ > BioPython mailing list - BioPython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From stewarta at nmrc.navy.mil Tue Jun 5 11:34:14 2007 From: stewarta at nmrc.navy.mil (Andrew Stewart) Date: Tue, 5 Jun 2007 11:34:14 -0400 Subject: [Bioperl-l] Setting attributes on a Bio::DB::GFF::Feature object Message-ID: <95C9F539-A4C4-4B6A-8DA8-079B957BF909@nmrc.navy.mil> I see bidirectional mutator methods for source, type, strand, etc.
in the Bio::DB::GFF::Feature documentation but I see that ->attributes is only able to get and not set the feature attributes. Is there no way to modify the attributes of a Bio::DB::GFF::Feature live? -- Andrew Stewart Research Assistant, Genomics Team Navy Medical Research Center (NMRC) Biological Defense Research Directorate (BDRD) BDRD Annex 12300 Washington Avenue, 2nd Floor Rockville, MD 20852 email: stewarta at nmrc.navy.mil phone: 301-231-6700 Ext 270 From cjfields at uiuc.edu Tue Jun 5 12:07:41 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 5 Jun 2007 11:07:41 -0500 Subject: [Bioperl-l] [BioPython] Cannot parse GenBank file In-Reply-To: <24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu> References: <46655550.70400@ribosome.natur.cuni.cz> <46656D64.7010508@ribosome.natur.cuni.cz> <24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu> Message-ID: One thing I missed which explains the biopython error: the LOCUS line is missing the locus identifier (see the NCBI example record link). This doesn't choke the bioperl parser but it appears to stop the biopython parser in its tracks (maybe a feature instead of a bug!). You should try adding a unique identifier (maybe the name of the file or record) to the LOCUS line to see if it works: LOCUS testfile 6499 bp ds-DNA linear 02-AUG-2006 The bioperl parser in CVS writes out the correct alphabet when this is added: LOCUS testfile 6499 bp ds-DNA linear 02-AUG-2006 I'll try adding a warning to the bioperl parser for this. chris On Jun 5, 2007, at 10:28 AM, Chris Fields wrote: > Martin, > > The example file you give in the bioperl bugzilla report has several > blank annotation lines which may lead to additional problems. When > the BioPerl SeqIO parser finds annotation fields (SOURCE, ORGANISM, > DEFINITION, etc) then it expects there will also be relevant data > (text descriptions) accompanying it; I assume the BioPython parser > expects likewise though I may be wrong.
> > AFAIK the inclusion of field names w/o text isn't GenBank/EMBL- > compliant. GenBank records lacking text either have a '.' instead or > are left out entirely: > > http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html > > We could add a fix but you should probably contact the ApE developers > and request that field names w/o text be left out or have '.' added. > > chris > > On Jun 5, 2007, at 9:04 AM, Martin MOKREJ? wrote: > >> Ezequiel Panepucci wrote: >>>> genbank entry = parser.parse(fhandle) >>> >>> there is a space character between "genbank" and "entry". >>> It is a syntax error. >>> I suppose you meant "genbank_entry" ? >> >> Yes, the next command was right and has shown the error. Sorry, I >> forgot >> to delete the first attempt. ;-) >> >>>>> genbank_entry = parser.parse(fhandle) >> Traceback (most recent call last): >> File "", line 1, in ? >> File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py", >> line 187, in parse >> self._scanner.feed(handle, self._consumer) >> File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", >> line 360, in feed >> self._feed_first_line(consumer, self.line) >> File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", >> line 835, in _feed_first_line >> assert False, \ >> AssertionError: Did not recognise the LOCUS line layout: >> LOCUS 6499 bp ds-DNA linear 02-AUG-2006 >> >>>>> >> >> Martin >> _______________________________________________ >> BioPython mailing list - BioPython at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biopython > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > > _______________________________________________ > BioPython mailing list - BioPython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From staffa at niehs.nih.gov Tue Jun 5 22:00:34 2007 From: staffa at niehs.nih.gov (Staffa, Nick (NIH/NIEHS)) Date: Tue, 05 Jun 2007 22:00:34 -0400 Subject: [Bioperl-l] Bio::DB::Fasta In-Reply-To: Message-ID: I am wondering if I knew what this error message exactly meant, if I could discern my error. I don't see much difference in this program and programs that worked. Can I assume that the new worked because an index file exists? I don't know how the filehandle UTR_TT_GENES gets involved. Maybe I should use some other module, but I really would like to have get_Seq_by_id functionality. The error message: Dpse ortholog = Dpse_GA17307 fetching GA17307 Can't call method "seq" on an undefined value at Match-emNEWTEST.pl line 84, line 4. Relevant code: #!/usr/bin/perl # # # use strict; use Bio::DB::Fasta; use Bio::Tools::SeqWords; use Bio::Seq; use Bio::SeqIO; # my $db = Bio::DB::Fasta->new('/home/staffa/clients/Kari/D_pse_genome/testit/TT_orthol ogs_Dpse_genes.fa', -makeid => \&make_my_id); ... ... ... my $pse_obj = $db->get_Seq_by_id('GA17307'); my $pse_sequence = $pse_obj->seq; Nick Staffa Telephone: 919-316-4569 (NIEHS: 6-4569) Scientific Computing Support Group NIEHS Information Technology Support Services Contract (Science Task Monitor: John D. Grovenstein (grovens1 at niehs.nih.gov) National Institute of Environmental Health Sciences National Institutes of Health Research Triangle Park, North Carolina From jason at bioperl.org Tue Jun 5 23:12:40 2007 From: jason at bioperl.org (Jason Stajich) Date: Tue, 5 Jun 2007 20:12:40 -0700 Subject: [Bioperl-l] Bio::DB::Fasta In-Reply-To: References: Message-ID: the file handle is probably not important, Perl just reports this if there is a filehandle open. more importantly what is on line 84.... my guess is you are trying to get a sequence out and it doesn't exist - some error code around the lines getting the sequence out would be helpful. 
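As a rough illustration of the kind of error check meant here, something along these lines could wrap the lookup. This is only a sketch: fetch_sequence_or_die is a made-up helper name, not part of BioPerl, and it assumes nothing about $db beyond a get_Seq_by_id method that returns undef when the ID is not in the index (which is what the error message above suggests is happening).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): look up a record by ID and fail
# with a readable message instead of the generic
# "Can't call method "seq" on an undefined value".
# $db is assumed to be any object with a get_Seq_by_id method, for example a
# Bio::DB::Fasta handle.
sub fetch_sequence_or_die {
    my ($db, $id) = @_;

    # An undefined return means the ID was not found in the index.
    my $seq_obj = $db->get_Seq_by_id($id);
    die "No sequence with ID '$id' in the index; "
      . "check the FASTA headers and the -makeid callback\n"
        unless defined $seq_obj;

    # Only call ->seq once we know we have an object.
    return $seq_obj->seq;
}
```

In the script above this would replace the two lines around line 84, e.g. my $pse_sequence = fetch_sequence_or_die($db, 'GA17307');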
On Jun 5, 2007, at 7:00 PM, Staffa, Nick (NIH/NIEHS) wrote: > I am wondering if I knew what this error message exactly meant, if > I could > discern my error. > I don't see much difference in this program and programs that worked. > Can I assume that the new worked because an index file exists? > I don't know how the filehandle UTR_TT_GENES gets involved. > Maybe I should use some other module, but I really would like to have > get_Seq_by_id functionality. > > The error message: > Dpse ortholog = Dpse_GA17307 > fetching GA17307 > Can't call method "seq" on an undefined value at Match-emNEWTEST.pl > line 84, > line 4. > > Relevant code: > #!/usr/bin/perl > # > # > # > use strict; > use Bio::DB::Fasta; > use Bio::Tools::SeqWords; > use Bio::Seq; > use Bio::SeqIO; > # > my $db = > Bio::DB::Fasta->new('/home/staffa/clients/Kari/D_pse_genome/testit/ > TT_orthol > ogs_Dpse_genes.fa', > -makeid => \&make_my_id); > ... > ... > ... > my $pse_obj = $db->get_Seq_by_id('GA17307'); > my $pse_sequence = $pse_obj->seq; > > > > > Nick Staffa > Telephone: 919-316-4569 (NIEHS: 6-4569) > Scientific Computing Support Group > NIEHS Information Technology Support Services Contract > (Science Task Monitor: John D. Grovenstein (grovens1 at niehs.nih.gov) > National Institute of Environmental Health Sciences > National Institutes of Health > Research Triangle Park, North Carolina > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: smime.p7s Type: application/pkcs7-signature Size: 2613 bytes Desc: not available URL: From torsten.seemann at infotech.monash.edu.au Wed Jun 6 02:06:37 2007 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Wed, 6 Jun 2007 16:06:37 +1000 Subject: [Bioperl-l] Bio::DB::Fasta In-Reply-To: References: Message-ID: Nick, > Can't call method "seq" on an undefined value at Match-emNEWTEST.pl line 84, The error makes it pretty clear. You are calling the ->seq method on an undefined value, ie. $pse_obj. > my $pse_obj = $db->get_Seq_by_id('GA17307'); # check we got something! die "sequence not in database" unless $pse_obj; > my $pse_sequence = $pse_obj->seq; -- --Torsten Seemann --Victorian Bioinformatics Consortium, Monash University --Tel +61 3 9905 9010 From shameer at ncbs.res.in Wed Jun 6 02:27:42 2007 From: shameer at ncbs.res.in (Shameer Khadar) Date: Wed, 6 Jun 2007 11:57:42 +0530 (IST) Subject: [Bioperl-l] Validation of files using BioPerl Message-ID: <34441.192.168.1.1.1181111262.squirrel@mail.ncbs.res.in> Dear All, How to validate an input file in fasta/PIR/GenPept/PDB format using Bioperl ? (This is to avoid unnecessary files to be submitted to servers by new users). Any module available ? Many thanks in advance, -- Shameer Khadar From cjfields at uiuc.edu Wed Jun 6 08:37:28 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 6 Jun 2007 07:37:28 -0500 Subject: [Bioperl-l] Validation of files using BioPerl In-Reply-To: <34441.192.168.1.1.1181111262.squirrel@mail.ncbs.res.in> References: <34441.192.168.1.1.1181111262.squirrel@mail.ncbs.res.in> Message-ID: <39F5F622-0C93-4DC5-B969-491F789FC932@uiuc.edu> It has been discussed but never coded. I believe if it passes through the Bio::SeqIO parser it's generally considered validly formatted (spacing, balanced quotes), though it doesn't specifically check FT keys and qualifiers for invalid ones, look for missing annotation, check taxonomy, etc. 
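A minimal sketch of that "does it pass through the parser" idea follows. The helper name check_parseable is invented for illustration and is not a BioPerl function; it only assumes it is handed a Bio::SeqIO stream (or anything else with a next_seq method) that throws an exception on input it cannot handle. As noted above, a clean pass says nothing about feature keys, qualifiers or taxonomy.

```perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl API): read every record from a
# parser-like object and report whether parsing threw an exception.
# Returns (ok_flag, number_of_records_read, error_text).
sub check_parseable {
    my ($seqio) = @_;
    my $count = 0;

    # eval traps any exception thrown by the parser; the trailing 1 lets us
    # tell a clean finish apart from a die.
    my $ok = eval {
        while ( defined( my $seq = $seqio->next_seq ) ) {
            $count++;
        }
        1;
    };

    my $error = $ok ? '' : ( $@ || 'unknown parser error' );
    return ( $ok ? 1 : 0, $count, $error );
}
```

With BioPerl installed, the call would presumably look like check_parseable(Bio::SeqIO->new(-file => $path, -format => 'genbank')). A file that parses without error but yields zero records is probably worth rejecting as well, and some format problems only produce warnings rather than exceptions, so this is a coarse filter at best.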
As long as the end sequence mark (//) is present for every file, you could try parsing the file into chunks (read with 'local $/ = '//';') and tossing the seq chunks as a filehandle (via IO::String) to a Bio::SeqIO object wrapped in an eval block (the parser resets $/, so it should work). Follow the eval with a check of $@ for caught errors. It might get tedious for big sequences... chris On Jun 6, 2007, at 1:27 AM, Shameer Khadar wrote: > Dear All, > > How to validate an input file in fasta/PIR/GenPept/PDB format using > Bioperl ? (This is to avoid unnecessary files to be submitted to > servers > by new users). Any module available ? > > Many thanks in advance, > -- > Shameer Khadar > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From staffa at niehs.nih.gov Wed Jun 6 10:40:49 2007 From: staffa at niehs.nih.gov (Staffa, Nick (NIH/NIEHS)) Date: Wed, 06 Jun 2007 10:40:49 -0400 Subject: [Bioperl-l] Bio::DB::Fasta In-Reply-To: Message-ID: Indeed. One must know what is actually in his header, AND one must write the appropriate make_id subroutine AND one must specify the exact ID. THEN things might work. And they did! THANK YOU On 6/6/07 2:06 AM, "Torsten Seemann" wrote: > Nick, > >> Can't call method "seq" on an undefined value at Match-emNEWTEST.pl line 84, > > The error makes it pretty clear. You are calling the ->seq method on > an undefined value, ie. $pse_obj. > >> my $pse_obj = $db->get_Seq_by_id('GA17307'); > > # check we got something!
> die "sequence not in database" unless $pse_obj; > >> my $pse_sequence = $pse_obj->seq; > From jaudall at gmail.com Wed Jun 6 17:51:33 2007 From: jaudall at gmail.com (Joshua Udall) Date: Wed, 6 Jun 2007 15:51:33 -0600 Subject: [Bioperl-l] blastxml interation Message-ID: <52cea20c0706061451i39e44aeev8dc58d1e635665e7@mail.gmail.com> I was searching in the deobfuscator under *Bio::Search::Result::BlastResult*but there doesn't seem to be a method to extract the iteration number from a blastxml report. I can see this number being possibly useful to count the number of queries that didn't hit anything since the are no empty reports in the blastxml output. If I'm missing something, I would welcome an example how to retrieve the result iteration number. Thanks in advance for any suggestions. Josh From dmessina at wustl.edu Wed Jun 6 18:18:26 2007 From: dmessina at wustl.edu (David Messina) Date: Wed, 6 Jun 2007 17:18:26 -0500 Subject: [Bioperl-l] blastxml interation In-Reply-To: <52cea20c0706061451i39e44aeev8dc58d1e635665e7@mail.gmail.com> References: <52cea20c0706061451i39e44aeev8dc58d1e635665e7@mail.gmail.com> Message-ID: I think you want to look at the hits(), num_hits() and no_hits_found () methods. There is a private method _next_iteration_index() which should do what you asked for, but num_hits() looks like the better way. By the way, hits() and num_hits() are listed on the Deobfuscator as having no documentation. This (as the below shows) is incorrect and is due to some nonstandard formatting issues which I will correct. _next_iteration_index() isn't listed on the Deobfuscator because it's a private method. Hope this helps! Dave hits() This method overrides Bio::Search::Result::GenericResult::hits to take into account the possibility of multiple iterations, as occurs in PSI- BLAST reports. If there are multiple iterations, all 'new' hits for all iterations are returned. These are the hits that did not occur in a previous iteration. 
See Also: Bio::Search::Result::GenericResult::hits num_hits() This method overrides Bio::Search::Result::GenericResult::num_hits to take into account the possibility of multiple iterations, as occurs in PSI-BLAST reports. If there are multiple iterations, calling num_hits() returns the number of 'new' hits for each iteration. These are the hits that did not occur in a previous iteration. See Also: Bio::Search::Result::GenericResult::num_hits no_hits_found() Usage : $nohits = $blast->no_hits_found( $iteration_number ); Purpose : Get boolean indicator indicating whether or not any hits were present in the report. This is NOT the same as determining the number of hits via the hits() method, which will return zero hits if there were no hits in the report or if all hits were filtered out during the parse. Thus, this method can be used to distinguish these possibilities for hitless reports generated when filtering. Returns : Boolean Argument : (optional) integer indicating the iteration number (PSI-BLAST) If iteration number is not specified and this is a PSI-BLAST result, then this method will return true only if all iterations had no hits found. From apurva at cshl.edu Wed Jun 6 19:51:45 2007 From: apurva at cshl.edu (Apurva Narechania) Date: Wed, 6 Jun 2007 19:51:45 -0400 Subject: [Bioperl-l] non-palindromic issue in Bio::Restriction::Analysis Message-ID: <3F7C7E33-416A-4141-969A-DDC4716E8A44@cshl.edu> Hi, I was hoping you could confirm and give me some feedback on an issue I think I've found with the Bio::Restriction::Analysis module. I am using the enzyme AciI, a non-palindromic restriction enzyme with a 5' C | CGC 3' recognition site. The module should search both the forward and the reverse complement strings in the case of a non-palindromic enzyme. I have found that this works only intermittently.
For example, the following sequence: GAAAAAAACAAAGGAAGAAGCTAGCTAGCAGGGCACGCGGTTTGAGGATGGCTGGTGGCCGACCGCAGGGCG CGCGGTTG GAGGATTGCTGGTGGCCGACCAGATGAAACTCACGCGCGGCTGGGGACAGCTGGAATATTTGGGCGGCGGCG GCTGGTAT TACGGGAAAGGAGAGATAGGGTTTTGGACGGCAGCAGCTGGTATTTGGGCCACCAATTTTGCGCGCCAGTAC AGGACACC GATGCCGCAAATTGCACAATGCCTTTTATGGCGACTGACAGTGCGATGCTATAGGTATGAATTGTCGACTGA CAAAGTGA CACTATTCACATATAAATATAACGAATAACACTCAGTTGGAATATAGACATATGCCGACTCACCATCTGTGG CAATGTAT ACCGACTAACAATTCGATGCTAATTCTCTATTTATAGCGACAGTCGTCAGACACTAATTTGGTGTTGTGGTA TAATGCTA GTGCCTCACCGCTGTAGGTGTTGGTCTACTGGTGC Should digest into 10 fragments using this enzyme, but the module produces only 7. Could you please confirm this behavior, and if observed, suggest some possible fixes? This may be a bug in the _non_pal_enz method, or may be me overlooking something pretty obvious. Thanks, Apurva Narechania. From cjfields at uiuc.edu Wed Jun 6 20:51:00 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 6 Jun 2007 19:51:00 -0500 Subject: [Bioperl-l] blastxml interation In-Reply-To: References: <52cea20c0706061451i39e44aeev8dc58d1e635665e7@mail.gmail.com> Message-ID: Joshua, Just to make sure there is no confusion, do you mean a Bio::Search::Iteration::IterationI-based object? The iteration tags have multiple meanings apparently in BLAST XML output (multiple queries, multiple PSI-BLAST iterations). The current SearchIO::blastxml parser returns multiple Bio::Search::Result::BlastResult objects based on the iterations, so PSI-BLAST output is treated as multiple BLAST reports regardless (i.e. no Iteration objects). This is something I want to rectify but it may not be a easy fix. chris On Jun 6, 2007, at 5:18 PM, David Messina wrote: > I think you want to look at the hits(), num_hits() and no_hits_found > () methods. There is a private method _next_iteration_index() which > should do what you asked for, but num_hits() looks like the better > way. 
> > By the way, hits() and num_hits() are listed on the Deobfuscator as > having no documentation. This (as the below shows) is incorrect and > is due to some nonstandard formatting issues which I will correct. > _next_iteration_index() isn't listed on the Deobfuscator because it's > a private method. > > > Hope this helps! > Dave > > > hits() > > This method overrides Bio::Search::Result::GenericResult::hits to take > into account the possibility of multiple iterations, as occurs in PSI- > BLAST reports. > If there are multiple iterations, all 'new' hits for all iterations > are returned. > These are the hits that did not occur in a previous iteration. > See Also: Bio::Search::Result::GenericResult::hits > > num_hits() > > This method overrides Bio::Search::Result::GenericResult::num_hits to > take > into account the possibility of multiple iterations, as occurs in PSI- > BLAST reports. > If there are multiple iterations, calling num_hits() returns the > number of > 'new' hits for each iteration. These are the hits that did not occur > in a previous iteration. > See Also: Bio::Search::Result::GenericResult::num_hits > > no_hits_found() > > Usage : $nohits = $blast->no_hits_found( $iteration_number ); > Purpose : Get boolean indicator indicating whether or not any hits > were present in the report. > This is NOT the same as determining the number of > hits via > the hits() method, which will return zero hits if there > were no > hits in the report or if all hits were filtered out > during the parse. > > Thus, this method can be used to distinguish these > possibilities > for hitless reports generated when filtering. > > Returns : Boolean > Argument : (optional) integer indicating the iteration number (PSI- > BLAST) > If iteration number is not specified and this is a PSI- > BLAST result, > then this method will return true only if all > iterations had > no hits found. 
> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Wed Jun 6 20:45:14 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 6 Jun 2007 20:45:14 -0400 Subject: [Bioperl-l] PostgreSQL schema support in BioSQL and bioperl-db Message-ID: I have added support to BioSQL and bioperl-db for schemas in PostgreSQL. A schema in PostgreSQL is more or less a namespace for database objects (tables, indexes, views, etc) within a database. (A database in PostgreSQL is similar to the concept of a user in Oracle or MySQL, and therefore for the latter two schemas are synonymous with a user. [Not sure I'm still up-to-date on this for MySQL, but at least that's what I recall.]) When using the load_{seqdatabase,ontology,ncbi_taxonomy}.pl scripts, you specify the schema in which BioSQL resides using the --schema option. If you are using bioperl-db as a library, the Bio::DB::BioDB->new() call also accepts a -schema named parameter, and Bio::DB::DBContextI objects have a $dbc->schema() property for getting/setting the schema, Bio::DB::SimpleDBContext->new() accepts a -schema parameter, and you may also add the property to the .bioperldb connection parameter file (-schema => 'yourschemahere'). Thanks for Brian Osborne for being the instigator (and tester, and for adding the code to load_ncbi_taxonomy.pl - I came too late). 
-hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From jaudall at gmail.com Wed Jun 6 17:41:08 2007 From: jaudall at gmail.com (Joshua Udall) Date: Wed, 6 Jun 2007 15:41:08 -0600 Subject: [Bioperl-l] blastxml interation number Message-ID: <52cea20c0706061441n96ce803v9422e8d14461c2bd@mail.gmail.com> I was searching in the deobfuscator under *Bio::Search::Result::BlastResult*but there doesn't seem to be a method to extract the iteration number from a blastxml report. I can see this number being very useful to count the number of queries that didn't hit anything since the are no empty reports in the blastxml output. If I'm missing something, I would welcome an example how to retrieve the result iteration number, otherwise I'm suggesting that an iteration_count feature be added to the Result object. Thanks in advance for any suggestions. Josh From holland at ebi.ac.uk Thu Jun 7 03:33:25 2007 From: holland at ebi.ac.uk (Richard Holland) Date: Thu, 07 Jun 2007 08:33:25 +0100 Subject: [Bioperl-l] PostgreSQL schema support in BioSQL and bioperl-db In-Reply-To: References: Message-ID: <4667B4C5.6070107@ebi.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Sounds great. BioJava users shouldn't need to change anything to get this to work as PostgreSQL JDBC connection objects already require you to specify a schema. cheers, Richard Hilmar Lapp wrote: > I have added support to BioSQL and bioperl-db for schemas in PostgreSQL. > A schema in PostgreSQL is more or less a namespace for database objects > (tables, indexes, views, etc) within a database. > > (A database in PostgreSQL is similar to the concept of a user in Oracle > or MySQL, and therefore for the latter two schemas are synonymous with a > user. 
[Not sure I'm still up-to-date on this for MySQL, but at least > that's what I recall.]) > > When using the load_{seqdatabase,ontology,ncbi_taxonomy}.pl scripts, you > specify the schema in which BioSQL resides using the --schema option. > > If you are using bioperl-db as a library, the Bio::DB::BioDB->new() call > also accepts a -schema named parameter, and Bio::DB::DBContextI objects > have a $dbc->schema() property for getting/setting the schema, > Bio::DB::SimpleDBContext->new() accepts a -schema parameter, and you may > also add the property to the .bioperldb connection parameter file > (-schema => 'yourschemahere'). > > Thanks for Brian Osborne for being the instigator (and tester, and for > adding the code to load_ncbi_taxonomy.pl - I came too late). > > -hilmar > --=========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.2.2 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGZ7TF4C5LeMEKA/QRApwUAJ48q46iX152pB6Xcc/717Ie8foUTQCgm3ij W/+0iO/ZsNDn1pLuf5yXbYA= =asUn -----END PGP SIGNATURE----- From mmokrejs at ribosome.natur.cuni.cz Thu Jun 7 10:26:44 2007 From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=) Date: Thu, 07 Jun 2007 16:26:44 +0200 Subject: [Bioperl-l] [BioPython] Cannot parse GenBank file In-Reply-To: References: <46655550.70400@ribosome.natur.cuni.cz> <46656D64.7010508@ribosome.natur.cuni.cz> <24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu> Message-ID: <466815A4.9060505@ribosome.natur.cuni.cz> Hi, Chris Fields wrote: > One thing I missed which explains the biopython error: the LOCUS line is > missing the locus identifier (see the NCBI example record link). This > doesn't choke the bioperl parser but it appears to stop the biopython > parser in it's tracks (maybe a feature instead of a bug!). 
> > You should try adding a unique identifier (maybe the name of the file or > record) to the LOCUS line to see if it works: > > LOCUS testfile 6499 bp ds-DNA linear 02-AUG-2006 > > The bioperl parser in CVS writes out the correct alphabet when this is > added: > > LOCUS testfile 6499 bp ds-DNA linear 02-AUG-2006 > > I'll try adding a warning to the bioperl parser for this. I have updated http://bugzilla.open-bio.org/show_bug.cgi?id=2305 but let me emphasize the LOCUS line now contains LOCUS pRL 5428 bp ds-DNA linear 07-JUN-2007 which still does not comply with the line you have proposed. But it can be parsed by bioperl-live from cvs. Is it still wrong? Testcase as pRL.gb-new in the bugzilla record #2305. Martin > > chris > > On Jun 5, 2007, at 10:28 AM, Chris Fields wrote: > >> Martin, >> >> The example file you give in the bioperl bugzilla report has several >> blank annotation lines which may lead to additional problems. When >> the BioPerl SeqIO parser finds annotation fields (SOURCE, ORGANISM, >> DEFINITION, etc) then it expects there will also be relevant data >> (text descriptions) accompanying it; I assume the BioPython parser >> expects likewise though I may be wrong. >> >> AFAIK the inclusion of field names w/o text isn't GenBank/EMBL- >> compliant. GenBank records lacking text either have a '.' instead or >> are left out entirely: >> >> http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html >> >> We could add a fix but you should probably contact the ApE developers >> and request that field names w/o text be left out or have '.' added. >> >> chris >> >> On Jun 5, 2007, at 9:04 AM, Martin MOKREJ? wrote: >> >>> Ezequiel Panepucci wrote: >>>>> genbank entry = parser.parse(fhandle) >>>> >>>> there is a space character between "genbank" and "entry". >>>> It is a syntax error. >>>> I suppose you meant "genbank_entry" ? >>> >>> Yes, the next command was right and has shown the error. Sorry, I >>> forgot >>> to delete the first attempt. 
;-) >>> >>>>>> genbank_entry = parser.parse(fhandle) >>> Traceback (most recent call last): >>> File "", line 1, in ? >>> File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py", >>> line 187, in parse >>> self._scanner.feed(handle, self._consumer) >>> File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", >>> line 360, in feed >>> self._feed_first_line(consumer, self.line) >>> File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", >>> line 835, in _feed_first_line >>> assert False, \ >>> AssertionError: Did not recognise the LOCUS line layout: >>> LOCUS 6499 bp ds-DNA linear 02-AUG-2006 >>> >>>>>> >>> >>> Martin >>> _______________________________________________ >>> BioPython mailing list - BioPython at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/biopython >> >> Christopher Fields >> Postdoctoral Researcher >> Lab of Dr. Robert Switzer >> Dept of Biochemistry >> University of Illinois Urbana-Champaign >> >> >> >> >> _______________________________________________ >> BioPython mailing list - BioPython at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biopython > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > > -- Dr. Martin Mokrejs Dept. 
of Genetics and Microbiology Faculty of Science, Charles University Vinicna 5, 128 43 Prague, Czech Republic http://www.iresite.org http://www.iresite.org/~mmokrejs From cjfields at uiuc.edu Thu Jun 7 11:31:45 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 7 Jun 2007 10:31:45 -0500 Subject: [Bioperl-l] [BioPython] Cannot parse GenBank file In-Reply-To: <466815A4.9060505@ribosome.natur.cuni.cz> References: <46655550.70400@ribosome.natur.cuni.cz> <46656D64.7010508@ribosome.natur.cuni.cz> <24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu> <466815A4.9060505@ribosome.natur.cuni.cz> Message-ID: <2A403865-F1E8-4D19-8D19-455C22E7C6D9@uiuc.edu> On Jun 7, 2007, at 9:26 AM, Martin MOKREJ? wrote: > Hi, > > Chris Fields wrote: >> One thing I missed which explains the biopython error: the LOCUS >> line is missing the locus identifier (see the NCBI example record >> link). This doesn't choke the bioperl parser but it appears to >> stop the biopython parser in it's tracks (maybe a feature instead >> of a bug!). >> You should try adding a unique identifier (maybe the name of the >> file or record) to the LOCUS line to see if it works: >> LOCUS testfile 6499 bp ds-DNA linear 02-AUG-2006 >> The bioperl parser in CVS writes out the correct alphabet when >> this is added: >> LOCUS testfile 6499 bp ds-DNA linear 02- >> AUG-2006 >> I'll try adding a warning to the bioperl parser for this. > > I have updated http://bugzilla.open-bio.org/show_bug.cgi?id=2305 > but let me > emphasize the LOCUS line now contains > LOCUS pRL 5428 bp ds-DNA linear > 07-JUN-2007 > > > which still does not comply with the line you have proposed. But it > can be > parsed by bioperl-live from cvs. Is it still wrong? Testcase as > pRL.gb-new > in the bugzilla record #2305. > > Martin That should work. 
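For anyone who wants to check a file before handing it to either parser, here is a rough sketch in plain Perl. It only tests whether something sits between LOCUS and the sequence length; the helper name locus_name and the regular expression are my own assumptions for illustration, not the check used by the bioperl or biopython parsers and not the full GenBank specification.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): return the locus name if the
# LOCUS line has one, or undef if the length follows 'LOCUS' directly.
sub locus_name {
    my ($line) = @_;
    # Expected shape: LOCUS <name> <length> bp ...
    if ($line =~ /^LOCUS\s+(\S+)\s+(\d+)\s+bp\b/) {
        return $1;
    }
    return undef;
}

for my $line (
    'LOCUS 6499 bp ds-DNA linear 02-AUG-2006',
    'LOCUS pRL 5428 bp ds-DNA linear 07-JUN-2007',
) {
    my $name = locus_name($line);
    print defined $name ? "ok: $name\n" : "missing locus name: $line\n";
}
```

The first line is the one from the traceback and is reported as missing a name; the second is the corrected line from the bug report.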
There isn't a strict uniqueness test (that would require caching and isn't worth the trouble IMHO), though you do need to add something unique for the accession/locus if you plan on indexing the records in the future.

Parsing GenBank data produced by third-party software is problematic at best; some programs seem to follow no hard-and-fast rule for their GenBank output, even though the specification is plainly stated in the NCBI release notes. My take on that is to have a stricter (read: follows the release notes) GenBank parser which passes off the data in the record to default handler methods. A user could then replace the defined handlers with their own by subclassing the default handler class and overriding the methods, or by adding their own code references directly.

chris

...

From rich at thevillas.eclipse.co.uk Fri Jun 8 07:00:45 2007
From: rich at thevillas.eclipse.co.uk (richard)
Date: Fri, 08 Jun 2007 12:00:45 +0100
Subject: [Bioperl-l] protparam
Message-ID: <466936DD.8080604@thevillas.eclipse.co.uk>

Hi,

I noticed that in April someone asked whether there was a bioperl module for obtaining protein sequence related properties using protparam. I have a module that could potentially be submitted to bioperl for this purpose. Does anybody have any thoughts on whether it should go in?

Example script and the module are at:

http://81.5.159.173/webshare/

Cheers
Rich

From cjfields at uiuc.edu Fri Jun 8 08:37:27 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 8 Jun 2007 07:37:27 -0500
Subject: [Bioperl-l] protparam
In-Reply-To: <466936DD.8080604@thevillas.eclipse.co.uk>
References: <466936DD.8080604@thevillas.eclipse.co.uk>
Message-ID: <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu>

Richard,

We'll gladly add this in, though it'll need to be bioperlized (inherit from Bio::Root::Root). We also generally ask for tests, but it should be easy to write up a quick test suite using any protein seq.

If you can, could you add some bioperl-like POD to the module (i.e.
SYNOPSIS, AUTHOR, DESCRIPTION, etc)?

thanks!

chris

On Jun 8, 2007, at 6:00 AM, richard wrote:

>
> Hi,
>
> I noticed that in April someone asked whether there was a bioperl mod
> for obtaining protein sequence related properties using protparam.
> I have a module that could potentially be submitted to bioperl for this
> purpose. Does anybody have any thoughts on whether it should go in?
>
> Example script and the module are at:
>
> http://81.5.159.173/webshare/
>
>
> Cheers
> Rich
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From mmokrejs at ribosome.natur.cuni.cz Fri Jun 8 07:09:42 2007
From: mmokrejs at ribosome.natur.cuni.cz (Martin MOKREJŠ)
Date: Fri, 08 Jun 2007 13:09:42 +0200
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file?
Message-ID: <466938F6.7050903@ribosome.natur.cuni.cz>

Hi,
how can I convert GenBank/EMBL formatted file to a GFF file? The manpage for Bio::Graphics::FeatureFile does not help me in this way. The information is in the file, so I want just to extract the features to a GFF format, probably somewhere the sequence has to be stored ...

Is there a tool so I can convert it automatically? ;) This would be great. I can't make the GFF manually for every file. Other programs draw plasmid maps also automatically from the GenBank formatted input so how can I do it in bioperl?
Thanks for help, Martin From shameer at ncbs.res.in Fri Jun 8 10:11:00 2007 From: shameer at ncbs.res.in (Shameer Khadar) Date: Fri, 8 Jun 2007 19:41:00 +0530 (IST) Subject: [Bioperl-l] protparam In-Reply-To: <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu> References: <466936DD.8080604@thevillas.eclipse.co.uk> <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu> Message-ID: <54411.192.168.1.1.1181311860.squirrel@mail.ncbs.res.in> Richard, I asked for protparam module in bioperl ! Thats a good job. Cheers, SK > Richard, > > We'll gladly add this in, though it'll need to be bioperlized > (inherit Bio::Root::Root). We also generally ask for tests but it > should be easy to write up a quick test suite using any protein seq. > > If you can could you add some bioperl-like POD to the module (i.e. > SYNOPSIS, AUTHOR, DESCRIPTION, etc)? > > thanks! > > chris > > On Jun 8, 2007, at 6:00 AM, richard wrote: > >> >> Hi, >> >> I noticed that in April someone asked whether there was a bioperl mod >> for obtaining protein sequence related properties using protparam. >> I have a module that could potentially be submitted to bioperl for >> this >> purpose. Does anybody have any thoughts on whether it should go in? >> >> Example script and the module are at: >> >> http://81.5.159.173/webshare/ >> >> >> Cheers >> Rich >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Shameer Khadar Prof. R. 
Sowdhamini's Lab (# 25) The Computational Biology Group National Centre for Biological Sciences (TIFR) GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India T - 91-080-23666001 EXT - 6251 W - http://www.ncbs.res.in From dmessina at wustl.edu Fri Jun 8 10:58:20 2007 From: dmessina at wustl.edu (David Messina) Date: Fri, 8 Jun 2007 09:58:20 -0500 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? In-Reply-To: <466938F6.7050903@ribosome.natur.cuni.cz> References: <466938F6.7050903@ribosome.natur.cuni.cz> Message-ID: <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> Hi Martin, You're in luck -- the BioPerl core distribution includes two scripts for doing just that: genbank2gff genbank2gff3 Look in the scripts directory of the distro. Also, there is a *huge* amount of documentation and examples on the BioPerl website. http://www.bioperl.org/wiki/HOWTOs Reading those, reading the FAQ, and searching the mailing list archives are where I look first when I don't know how to do something in BioPerl. Dave -- Dave Messina Senior Analyst, Assembly Group Genome Sequencing Center Washington University St. Louis, MO From rich at thevillas.eclipse.co.uk Fri Jun 8 11:51:21 2007 From: rich at thevillas.eclipse.co.uk (richard) Date: Fri, 08 Jun 2007 16:51:21 +0100 Subject: [Bioperl-l] protparam In-Reply-To: <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu> References: <466936DD.8080604@thevillas.eclipse.co.uk> <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu> Message-ID: <46697AF9.2090502@thevillas.eclipse.co.uk> Hi, ok, great, that's no problem. I'll add the POD and bioperlize it, thanks Rich Chris Fields wrote: > Richard, > > We'll gladly add this in, though it'll need to be bioperlized > (inherit Bio::Root::Root). We also generally ask for tests but it > should be easy to write up a quick test suite using any protein seq. > > If you can could you add some bioperl-like POD to the module (i.e. > SYNOPSIS, AUTHOR, DESCRIPTION, etc)? > > thanks! 
> > chris > > On Jun 8, 2007, at 6:00 AM, richard wrote: > > >> Hi, >> >> I noticed that in April someone asked whether there was a bioperl mod >> for obtaining protein sequence related properties using protparam. >> I have a module that could potentially be submitted to bioperl for >> this >> purpose. Does anybody have any thoughts on whether it should go in? >> >> Example script and the module are at: >> >> http://81.5.159.173/webshare/ >> >> >> Cheers >> Rich >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From cjfields at uiuc.edu Fri Jun 8 13:45:17 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 8 Jun 2007 12:45:17 -0500 Subject: [Bioperl-l] protparam In-Reply-To: <46697AF9.2090502@thevillas.eclipse.co.uk> References: <466936DD.8080604@thevillas.eclipse.co.uk> <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu> <46697AF9.2090502@thevillas.eclipse.co.uk> Message-ID: Another issue is namespace. I suggest Bio::Tools::ProtParam, though there may be some others out there. We can add support for direct Bio::Seq/PrimarySeq input and other odds and ends once it's committed. Good work! chris On Jun 8, 2007, at 10:51 AM, richard wrote: > > Hi, > > ok, great, that's no problem. I'll add the POD and bioperlize it, > > thanks > Rich > > Chris Fields wrote: >> Richard, >> >> We'll gladly add this in, though it'll need to be bioperlized >> (inherit Bio::Root::Root). We also generally ask for tests but it >> should be easy to write up a quick test suite using any protein seq. 
>> >> If you can could you add some bioperl-like POD to the module (i.e. >> SYNOPSIS, AUTHOR, DESCRIPTION, etc)? >> >> thanks! >> >> chris >> >> On Jun 8, 2007, at 6:00 AM, richard wrote: >> >> >>> Hi, >>> >>> I noticed that in April someone asked whether there was a bioperl >>> mod >>> for obtaining protein sequence related properties using protparam. >>> I have a module that could potentially be submitted to bioperl for >>> this >>> purpose. Does anybody have any thoughts on whether it should go in? >>> >>> Example script and the module are at: >>> >>> http://81.5.159.173/webshare/ >>> >>> >>> Cheers >>> Rich >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> >> Christopher Fields >> Postdoctoral Researcher >> Lab of Dr. Robert Switzer >> Dept of Biochemistry >> University of Illinois Urbana-Champaign >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Mon Jun 11 07:30:24 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 11 Jun 2007 07:30:24 -0400 Subject: [Bioperl-l] script to load ITIS taxonomy Message-ID: <897DB32F-4AEE-4388-A499-C71BFD2281DE@gmx.net> Hi all - I added a script to load the ITIS taxonomy (www.itis.gov) into the phylodb module. It is called load_itis_taxonomy.pl and is in the scripts/ directory. 
It is independent of BioPerl right now (the ITIS download is either a MS SQL Server or an Informix dump - no kidding), but I'm hoping that at some point support for this can be integrated into Bio::TreeIO. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Mon Jun 11 08:24:50 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 11 Jun 2007 07:24:50 -0500 Subject: [Bioperl-l] script to load ITIS taxonomy In-Reply-To: <897DB32F-4AEE-4388-A499-C71BFD2281DE@gmx.net> References: <897DB32F-4AEE-4388-A499-C71BFD2281DE@gmx.net> Message-ID: <99AC6C0F-10DD-4587-AFB3-32BC495CD2BD@uiuc.edu> On Jun 11, 2007, at 6:30 AM, Hilmar Lapp wrote: > Hi all - > > I added a script to load the ITIS taxonomy (www.itis.gov) into the > phylodb module. It is called load_itis_taxonomy.pl and is in the > scripts/ directory. > > It is independent of BioPerl right now (the ITIS download is either a > MS SQL Server or an Informix dump - no kidding), but I'm hoping that > at some point support for this can be integrated into Bio::TreeIO. > > -hilmar > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== I second the TreeIO support. Anyone up for it? chris From ryanx07 at hotmail.com Mon Jun 11 11:24:31 2007 From: ryanx07 at hotmail.com (L Xu) Date: Mon, 11 Jun 2007 10:24:31 -0500 Subject: [Bioperl-l] basic questions Message-ID: I just started to learn BioPerl by reading the BioPerl Tutorial on the BioPerl website. By trying the 1st example on my window, use Bio::Perl; $seq_object = get_sequence('swiss',"ID ROA1_HUMAN"); write_sequence(">roa1.fasta",'fasta',$seq_object); I got the error as the following: ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: swissprot stream with no ID. 
Not swissprot in my book
STACK: Error::throw
STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm:350
STACK: Bio::SeqIO::swiss::next_seq C:/Perl/site/lib/Bio\SeqIO\swiss.pm:178
STACK: Bio::DB::WebDBSeqI::get_Seq_by_id C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm:153
STACK: Bio::Perl::get_sequence C:/Perl/site/lib/Bio/Perl.pm:510
STACK: t8.pl:7

I cannot figure out what is wrong and cannot find the solution on the web. Could someone help me please? Also, this leads to my 2nd question: is there a way to search the archive of the current list?

Thanks so much
R

From dmessina at wustl.edu Mon Jun 11 12:34:29 2007
From: dmessina at wustl.edu (David Messina)
Date: Mon, 11 Jun 2007 11:34:29 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To:
References:
Message-ID: <25517EA3-7BDA-44AC-BDF3-93A6810D9D63@wustl.edu>

The example code works here, but I'm on OS X. Could you tell us which version of Perl and BioPerl you are using, and which operating system? Are you getting anything in the roa1.fasta file?

> is there a way to search the archive of the current list?

http://www.bioperl.org/wiki/Mailing_lists

Dave

From dmessina at wustl.edu Mon Jun 11 14:48:23 2007
From: dmessina at wustl.edu (David Messina)
Date: Mon, 11 Jun 2007 13:48:23 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To:
References:
Message-ID: <127743A7-1923-4DBF-A96E-276B5E0A7692@wustl.edu>

Hi,

Please use 'Reply All' so everyone on the list can follow the discussion.

Try adding the following line after the line that starts with $seq_object:

    print STDERR ref($seq_object), "\t", $seq_object->display_id, "\n";

And then run the program again.
What do you get? Could you post a complete printout of what you're doing?

Dave

On Jun 11, 2007, at 11:45 AM, L Xu wrote:
> I used WinXP with BioPerl Inst_version 2.1.8 (Bioperl 1.5.2) and
> activeperl 5.8.8.819 Thank you very much.

From johnsonm at gmail.com Mon Jun 11 20:45:13 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Mon, 11 Jun 2007 19:45:13 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split)
Message-ID:

This bit in Bio::SeqFeature::Gene::Exon is causing me some problems trying to extend Bio::Tools::Glimmer to handle 'wraparound' genes (circular genomes):

    sub location {
        my ($self,$value) = @_;

        if(defined($value) && $value->isa('Bio::Location::SplitLocationI')) {
            $self->throw("split or compound location is not allowed ".
                         "for an object of type " . ref($self));
        }
        return $self->SUPER::location($value);
    }

That seems to be there all the way back to the initial revision (checked in by Hilmar). I presume it's there because of code like this (from the seq() method in Bio::SeqFeature::Generic):

    # assumming our seq object is sensible, it should not have to yank
    # the entire sequence out here.

    my $seq = $self->{'_gsf_seq'}->trunc($self->start(), $self->end());

That's not going to work too well with a feature that has a Bio::Location::Split location. Fixing it up seems straightforward, if a bit hackish. Something like:

    my $seq;

    if (ref($self->location()) eq 'Bio::Location::Split') {

        my $seqstring;
        my @sublocs = $self->location()->sub_Location();

        foreach my $subloc (@sublocs) {
            $seqstring .= $self->{'_gsf_seq'}->trunc($subloc->start(), $subloc->end())->seq();
        }

        # assign to the outer $seq; a second 'my' here would hide the result
        $seq = Bio::Seq->new(
            -id  => $self->{'_gsf_seq'}->display_id(),
            -seq => $seqstring
        );

    }
    else {
        $seq = $self->{'_gsf_seq'}->trunc($self->start(), $self->end());
    }

I don't see any companion to trunc() in Bio::PrimarySeqI for joining sequences. A join() would be handy, and make the above cleaner.
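To make the idea concrete, here is a rough sketch of such a join using plain strings and hashes rather than real BioPerl objects. The helper name join_sublocations and the start/end hash keys are made up for illustration and are not an existing BioPerl API; coordinates are assumed to be 1-based and inclusive, as in the trunc() calls above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): concatenate the pieces of a
# sequence string described by a list of 1-based, inclusive ranges.
sub join_sublocations {
    my ($sequence, @sublocs) = @_;
    my $joined = '';
    for my $loc (@sublocs) {
        my ($start, $end) = @{$loc}{qw(start end)};
        die "bad range $start..$end\n"
            if $start < 1 || $end > length($sequence) || $start > $end;
        # substr() is 0-based, so shift the start by one
        $joined .= substr($sequence, $start - 1, $end - $start + 1);
    }
    return $joined;
}

# A 'wraparound' gene on a 10 bp circular genome: 8..10 joined to 1..3
my $genome = 'ACGTTGCAAT';
my $gene   = join_sublocations(
    $genome,
    { start => 8, end => 10 },
    { start => 1, end => 3 },
);
print "$gene\n";
```

The pieces are joined in the order given, which is what a gene spanning the origin of a circular genome needs.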
Comments, suggestions, rotten fruit? From torsten.seemann at infotech.monash.edu.au Tue Jun 12 02:18:27 2007 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Tue, 12 Jun 2007 16:18:27 +1000 Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split) In-Reply-To: References: Message-ID: Mark, > if (ref($self->location()) eq 'Bio::Location::Split')) { > my $seqstring; > my @sublocs = $self->location()->sub_Location(); > > foreach my $subloc (@sublocs) { > $seqstring .= $self->{'_gsf_seq'}->trunc($subloc->start(), > $subloc->end())->seq(); > } Can you use the ->spliced_seq() method to do this? http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/SeqFeatureI.html#POD11 -- --Torsten Seemann --Victorian Bioinformatics Consortium, Monash University --Tel +61 3 9905 9010 From pengchy at yahoo.com.cn Tue Jun 12 03:00:46 2007 From: pengchy at yahoo.com.cn (=?gb2312?q?=D1=EE=20=C5=F4=B3=CC?=) Date: Tue, 12 Jun 2007 15:00:46 +0800 (CST) Subject: [Bioperl-l] Can't locate loadable object for module TFBS::Ext::pwmsearch Message-ID: <66745.92089.qm@web15205.mail.cnb.yahoo.com> hi all, Today, I download the TFBS package from http://forkhead.cgb.ki.se/TFBS/, and uncompress it and copy all the files contained in the TFBS and Ext directories to directory "C:\perl\site\lib", then put Ext under the TFBS directory. I run the example script1.pl, but a wrong message respond: Can't locate loadable object for module TFBS::Ext::pwmsearch in @INC (@INC contains: C:/perl/site/lib C:/perl/lib .) at C:/perl/site/lib/TFBS/Matrix/PWM.pm line 141 Compilation failed in require at C:/perl/site/lib/TFBS/Matrix/PWM.pm line 141, line 206. BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/Matrix/PWM.pm line 141, line 206. Compilation failed in require at C:/perl/site/lib/TFBS/Matrix/PFM.pm line 137, < DATA> line 206. BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/Matrix/PFM.pm line 137, line 206. 
Compilation failed in require at C:/perl/site/lib/TFBS/DB/TRANSFAC.pm line 52, line 206. BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/DB/TRANSFAC.pm line 52, line 206. Compilation failed in require at script1.pl line 3, line 206. BEGIN failed--compilation aborted at script1.pl line 3, line 206. shell returned 2 when I run the list_matrices.pl script, the same message respond. But when I empty the pwmsearch.pm file, following message respond: TFBS/Ext/pwmsearch.pm did not return a true value at :/perl/site/lib/TFBS/Matr x/PWM.pm line 141, line 206. BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/Matrix/PWM.pm line 11, line 206. Compilation failed in require at C:/perl/site/lib/TFBS/Matrix/PFM.pm line 137, DATA> line 206. BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/Matrix/PFM.pm line 17, line 206. Compilation failed in require at C:/perl/site/lib/TFBS/DB/TRANSFAC.pm line 52, DATA> line 206. BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/DB/TRANSFAC.pm line2, line 206. Compilation failed in require at script1.pl line 3, line 206. BEGIN failed--compilation aborted at script1.pl line 3, line 206. Is anyone else meet the same problem? Is it a bug for TFBS package? Best wishes! Sincerely, Pengcheng --------------------------------- ????????????????3.5G??????20M?????? From bix at sendu.me.uk Tue Jun 12 03:32:02 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 12 Jun 2007 08:32:02 +0100 Subject: [Bioperl-l] Can't locate loadable object for module TFBS::Ext::pwmsearch In-Reply-To: <66745.92089.qm@web15205.mail.cnb.yahoo.com> References: <66745.92089.qm@web15205.mail.cnb.yahoo.com> Message-ID: <466E4BF2.7020504@sendu.me.uk> ? ?? wrote: > hi all, > > Today, I download the TFBS package from > http://forkhead.cgb.ki.se/TFBS/, and uncompress it and copy all the > files contained in the TFBS and Ext directories to directory > "C:\perl\site\lib", then put Ext under the TFBS directory. 
I run the > example script1.pl, but a wrong message respond: > > Can't locate loadable object for module TFBS::Ext::pwmsearch in @INC You have to follow the installation instructions in the README file. Copying the files out is insufficient - you have to 'make'. From ryanx07 at hotmail.com Tue Jun 12 07:30:09 2007 From: ryanx07 at hotmail.com (L Xu) Date: Tue, 12 Jun 2007 06:30:09 -0500 Subject: [Bioperl-l] basic questions In-Reply-To: <127743A7-1923-4DBF-A96E-276B5E0A7692@wustl.edu> Message-ID: Here is the code: use Bio::Perl; $seq_object = get_sequence('swiss',"ROA1_HUMAN"); print STDERR ref($seq_object), "\t", $seq_object->display_id, "\n"; write_sequence(">roa1.fasta",'fasta',$seq_object); The output looks like the same as the previous version: Microsoft Windows XP [Version 5.1.2600] (C) Copyright 1985-2001 Microsoft Corp. C:\~Scripts>perl test.pl ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: swissprot stream with no ID. Not swissprot in my book STACK: Error::throw STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm:350 STACK: Bio::SeqIO::swiss::next_seq C:/Perl/site/lib/Bio\SeqIO\swiss.pm:178 STACK: Bio::DB::WebDBSeqI::get_Seq_by_id C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm:15 3 STACK: Bio::Perl::get_sequence C:/Perl/site/lib/Bio/Perl.pm:510 STACK: test.pl:7 ----------------------------------------------------------- Thanks. >From: David Messina >To: L Xu >CC: BioPerl list >Subject: Re: [Bioperl-l] basic questions >Date: Mon, 11 Jun 2007 13:48:23 -0500 > >Hi, > >Please use 'Reply All' so everyone on the list can follow the discussion. > >Try adding the following line after the line that starts with $seq_object: > > print STDERR ref($seq_object), "\t", $seq_object->display_id, "\n"; > >And then run the program again. What do you get? Could you post a complete >printout of what you're doing? 
>
>
>Dave
>
>
>On Jun 11, 2007, at 11:45 AM, L Xu wrote:
>>I used WinXP with BioPerl Inst_version 2.1.8 (Bioperl 1.5.2) and
>>activeperl 5.8.8.819 Thank you very much.
>

From pengchy at yahoo.com.cn Tue Jun 12 10:33:15 2007
From: pengchy at yahoo.com.cn (Pengcheng Yang)
Date: Tue, 12 Jun 2007 22:33:15 +0800 (CST)
Subject: [Bioperl-l] Reply: Re: basic questions
In-Reply-To:
Message-ID: <936780.8655.qm@web15215.mail.cnb.yahoo.com>

I have the same problem. I guess that the SwissProt database has some problems!

code:

    use Bio::DB::SwissProt;
    $sp = new Bio::DB::SwissProt;
    $seq = $sp->get_Seq_by_id('KPY1_ECOLI');
    print ref($seq),"\t",$seq->display_id,"\n"

the message:

------------- EXCEPTION -------------
MSG: swissprot stream with no ID. Not swissprot in my book
STACK Bio::SeqIO::swiss::next_seq C:/perl/site/lib/Bio\SeqIO\swiss.pm:180
STACK Bio::DB::WebDBSeqI::get_Seq_by_id C:/perl/site/lib/Bio/DB/WebDBSeqI.pm:154
STACK toplevel t.pl:7
--------------------------------------

--- L Xu wrote:

> Here is the code:
>
> use Bio::Perl;
> $seq_object = get_sequence('swiss',"ROA1_HUMAN");
> print STDERR ref($seq_object), "\t", $seq_object->display_id, "\n";
> write_sequence(">roa1.fasta",'fasta',$seq_object);
>
> The output looks like the same as the previous version:
>
> Microsoft Windows XP [Version 5.1.2600]
> (C) Copyright 1985-2001 Microsoft Corp.
>
> C:\~Scripts>perl test.pl
>
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: swissprot stream with no ID.
Not swissprot in my book > STACK: Error::throw > STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm:350 > STACK: Bio::SeqIO::swiss::next_seq > C:/Perl/site/lib/Bio\SeqIO\swiss.pm:178 > STACK: Bio::DB::WebDBSeqI::get_Seq_by_id > C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm:15 > 3 > STACK: Bio::Perl::get_sequence C:/Perl/site/lib/Bio/Perl.pm:510 > STACK: test.pl:7 > ----------------------------------------------------------- > > Thanks. > > > > > > >From: David Messina > >To: L Xu > >CC: BioPerl list > >Subject: Re: [Bioperl-l] basic questions > >Date: Mon, 11 Jun 2007 13:48:23 -0500 > > > >Hi, > > > >Please use 'Reply All' so everyone on the list can follow the > discussion. > > > >Try adding the following line after the line that starts with > $seq_object: > > > > print STDERR ref($seq_object), "\t", $seq_object->display_id, "\n"; > > > >And then run the program again. What do you get? Could you post a > complete > >printout of what you're doing? > > > > > >Dave > > > > > >On Jun 11, 2007, at 11:45 AM, L Xu wrote: > >>I used WinXP with BioPerl Inst_version 2.1.8 (Bioperl 1.5.2) and > >>activeperl 5.8.8.819 Thank you very much. > > > > _________________________________________________________________ > Picture this ?share your photos and you could win big! > http://www.GETREALPhotoContest.com?ocid=TXT_TAGHM&loc=us > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Best wishes! Sincerely, Pengcheng ___________________________________________________________ ????????????????3.5G??????20M?????? 
From drummike at gmail.com Tue Jun 12 11:49:36 2007
From: drummike at gmail.com (Mike Williams)
Date: Tue, 12 Jun 2007 11:49:36 -0400
Subject: [Bioperl-l] Re: [Bioperl-l] Reply: Re: basic questions
In-Reply-To: <936780.8655.qm@web15215.mail.cnb.yahoo.com>
References: <936780.8655.qm@web15215.mail.cnb.yahoo.com>
Message-ID:

On 6/12/07, Pengcheng Yang wrote:
> I have the same problem.
> I guess that the SwissProt database has some problems!
> code:
> use Bio::DB::SwissProt;
> $sp = new Bio::DB::SwissProt;
> $seq = $sp->get_Seq_by_id('KPY1_ECOLI');
> print ref($seq),"\t",$seq->display_id,"\n"

> ------------- EXCEPTION -------------
> MSG: swissprot stream with no ID. Not swissprot in my book
> STACK toplevel t.pl:7

This is a different problem. The id was not valid. If you change KPY1 to KPYK1 it works fine.

    $seq = $sp->get_Seq_by_id('KPYK1_ECOLI');
    print ref($seq),"\t",$seq->display_id,"\n"

    [mike at Wheatley]$ ./bio_quest2.pl
    Bio::Seq::RichSeq KPYK1_ECOLI

If you got this example from the BioPerl site, would you please post the url? It seems to me this same problem has come up before, but I could not find it in the archives or on the web site.

Mike

From ryanx07 at hotmail.com Tue Jun 12 11:42:28 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Tue, 12 Jun 2007 10:42:28 -0500
Subject: [Bioperl-l] basic questions
Message-ID:

I tested another piece of code from the tutorial (the 2nd test on the same machine) and got an error again. I don't know what happened; please help. Thanks so much.
===========================================================
Code:

    use Bio::Restriction::EnzymeCollection;

    my $all_collection = Bio::Restriction::EnzymeCollection;
    my $six_cutter_collection = $all_collection->cutters(6);
    for my $enz ($six_cutter_collection){
    print $enz->name,"\t",$enz->site,"\t",$enz->overhang_seq,"\n";
    # prints name, recognition site, overhang
    }

===========================================
Results:

C:\~Scripts>perl t9.pl
Can't use string ("Bio::Restriction::EnzymeCollecti") as a HASH ref while "strict refs" in use at C:/Perl/site/lib/Bio/Restriction/EnzymeCollection.pm line 236.

= = = Original message = = =

On Jun 11, 2007, at 11:45 AM, L Xu wrote:
I used WinXP with BioPerl Inst_version 2.1.8 (Bioperl 1.5.2) and
activeperl 5.8.8.819 Thank you very much.

From limericksean at gmail.com Tue Jun 12 12:04:40 2007
From: limericksean at gmail.com (Sean O'Keeffe)
Date: Tue, 12 Jun 2007 18:04:40 +0200
Subject: [Bioperl-l] gff2xml
Message-ID: <462784640706120904g25a6550dsc56a22af64ca98cd@mail.gmail.com>

Hi all,
I posted this on the gbrowse list earlier. I'm looking to convert gff data files into xml. Does anyone know of a module written to do this already?

respect,
sean.
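If nothing ready-made turns up, the conversion itself is not hard to sketch in plain Perl. The following is only an illustration under stated assumptions: it expects the usual nine tab-separated GFF columns, and the element name "feature", the attribute layout, and the helper names xml_escape and gff_line_to_xml are all invented here; none of this is a standard GFF XML schema or a BioPerl API.

```perl
use strict;
use warnings;

# The nine tab-separated GFF columns, in order.
my @GFF_COLUMNS = qw(seqid source type start end score strand phase attributes);

# Escape the five characters that are special in XML text and attributes.
sub xml_escape {
    my ($text) = @_;
    my %entity = (
        '&' => '&amp;',  '<' => '&lt;', '>' => '&gt;',
        '"' => '&quot;', "'" => '&apos;',
    );
    $text =~ s/([&<>"'])/$entity{$1}/g;
    return $text;
}

# Hypothetical helper: turn one GFF feature line into a <feature .../> element.
# Returns undef for comments, blank lines and lines without nine columns.
sub gff_line_to_xml {
    my ($line) = @_;
    chomp $line;
    return undef if $line =~ /^\s*(#|$)/;
    my @fields = split /\t/, $line, -1;
    return undef unless @fields == @GFF_COLUMNS;
    my @attrs = map { sprintf '%s="%s"', $GFF_COLUMNS[$_], xml_escape($fields[$_]) }
                0 .. $#fields;
    return '<feature ' . join(' ', @attrs) . '/>';
}

my $example = join "\t", 'chr1', 'glimmer', 'gene', 100, 400, '.', '+', '0', 'ID=gene1';
print gff_line_to_xml($example), "\n";
```

A real converter would also want to split the attributes column into its own key/value pairs and wrap the elements in a root element; that is left out to keep the sketch short.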

From johnsonm at gmail.com Tue Jun 12 12:10:45 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Tue, 12 Jun 2007 11:10:45 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split)
In-Reply-To:
References:
Message-ID:

On 6/12/07, Torsten Seemann wrote:
> Can you use the ->spliced_seq() method to do this?
>
> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/SeqFeatureI.html#POD11
>
> --
> --Torsten Seemann
> --Victorian Bioinformatics Consortium, Monash University
> --Tel +61 3 9905 9010

Actually, I'd forgotten about spliced_seq(). That seems like it will Do The Right Thing. It's just up to the invoker to call spliced_seq() instead of seq() as appropriate.

So, is there any other code that will break if I modify Bio::SeqFeature::Gene::Exon::location to not throw an exception when encountering Bio::Location::SplitLocationI? I'm wondering if it's just a paranoid check or if it's there to guard against something. If the latter, I need to know what code to fix. I'll dig and look, but if anybody knows or has an idea, save me some time. I suppose I can just change it and see what tests start failing. 8)

From dmessina at wustl.edu Tue Jun 12 12:11:36 2007
From: dmessina at wustl.edu (David Messina)
Date: Tue, 12 Jun 2007 11:11:36 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To:
References:
Message-ID: <30B8F841-E694-4577-8C15-8703E846CDFE@wustl.edu>

Hmm, it almost looks like you're having an issue with line breaks.

The 'swissprot stream with no ID' error made me think that perhaps Perl wasn't seeing the second argument to get_sequence. And then your new program has the error 'Can't use string ("Bio::Restriction::EnzymeCollecti")' where the end of the word is cut off.

I don't know how ActivePerl handles Windows vs UNIX line breaks. Are there any example scripts that come with ActivePerl? If there are, and they run correctly, perhaps you could look to see how the line breaks are done and make sure that your program does it the same way.

Other than that, I'm not seeing an obvious answer to your problem -- anyone else have a suggestion?

Perhaps the easiest thing for you to do would be to reinstall BioPerl and make sure that you run the full test suite and that all of the tests pass. My guess is that something in your current setup is not quite right.

Dave

From cjfields at uiuc.edu Tue Jun 12 12:42:29 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 12 Jun 2007 11:42:29 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split)
In-Reply-To:
References:
Message-ID:

On Jun 12, 2007, at 11:10 AM, Mark Johnson wrote:

> On 6/12/07, Torsten Seemann
> wrote:
>> Can you use the ->spliced_seq() method to do this?
>>
>> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/
>> SeqFeatureI.html#POD11
>>
>> --
>> --Torsten Seemann
>> --Victorian Bioinformatics Consortium, Monash University
>> --Tel +61 3 9905 9010
>
> Actually, I'd forgotten about spliced_seq(). That seems like it
> will Do The Right Thing. It's just up to the invoker to call
> spliced_seq() instead of seq() as appropriate.
> So, is there any other code that will break if I modify
> Bio::SeqFeature::Gene::Exon::location to not throw an exception when
> encountering Bio::Location::SplitLocationI? I'm wondering if it's
> just a paranoid check or if it's there to guard against something. If
> the latter, I need to know what code to fix. I'll dig and look, but
> if anybody knows or has an idea, save me some time. I suppose I can
> just change it and see what tests start failing. 8)

I'm wondering why you want to use Bio::SeqFeature::Gene::Exon to describe the 'wrap-around' genes. The SeqFeature::Gene::Exon docs state that the Exon class is used to specifically describe exons, as the name implies. Exons are primarily eukaryotic in origin, so you shouldn't encounter wraparounds, and should not have split locations by definition (which likely explains the exception).

Wouldn't a SeqFeature::Generic work just as well using a split location?

chris

From johnsonm at gmail.com Tue Jun 12 12:59:54 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Tue, 12 Jun 2007 11:59:54 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split)
In-Reply-To:
References:
Message-ID:

That's a good point. Both Bio::Tools::Glimmer and Bio::Tools::Genemark produce Bio::SeqFeature::Gene objects, each with a single Bio::SeqFeature::Gene::Exon, when parsing predictions for prokaryotic sequence (multiple exons for eukaryotic). There are eukaryotic and prokaryotic versions of both predictor families. Maybe the most elegant solution would be to simply modify both modules to only emit Bio::SeqFeature::Generic features when operating on prokaryotic mode output? Fix the data model and the problem goes away. 8)

On 6/12/07, Chris Fields wrote:
>
> On Jun 12, 2007, at 11:10 AM, Mark Johnson wrote:
>
> > On 6/12/07, Torsten Seemann
> > wrote:
> >> Can you use the ->spliced_seq() method to do this?
> >>
> >> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/
> >> SeqFeatureI.html#POD11
> >>
> >> --
> >> --Torsten Seemann
> >> --Victorian Bioinformatics Consortium, Monash University
> >> --Tel +61 3 9905 9010
> >
> > Actually, I'd forgotten about spliced_seq(). That seems like it
> > will Do The Right Thing. It's just up to the invoker to call
> > spliced_seq() instead of seq() as appropriate.
> > So, is there any other code that will break if I modify
> > Bio::SeqFeature::Gene::Exon::location to not throw an exception when
> > encountering Bio::Location::SplitLocationI? I'm wondering if it's
> > just a paranoid check or if it's there to guard against something. If
> > the latter, I need to know what code to fix.
I'll dig and look, but
> > if anybody knows or has an idea, save me some time. I suppose I can
> > just change it and see what tests start failing. 8)
>
> I'm wondering why you want to use Bio::SeqFeature::Gene::Exon to
> describe the 'wrap-around' genes. The SeqFeature::Gene::Exon docs
> state that the Exon class is used to specifically describe exons, as
> the name implies. Exons are primarily eukaryotic in origin, so you
> shouldn't encounter wraparounds, and should not have split locations
> by definition (which likely explains the exception).
>
> Wouldn't a SeqFeature::Generic work just as well using a split location?
>
> chris
>

From ryanx07 at hotmail.com Tue Jun 12 13:17:18 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Tue, 12 Jun 2007 12:17:18 -0500
Subject: [Bioperl-l] basic questions
Message-ID:

I reinstalled ActivePerl and BioPerl; ActivePerl is now 5.8.8 build 820. However, both scripts still produce the same errors on my computer. I also tested the code on another WinXP computer with the same versions of ActivePerl and BioPerl: the swissprot script worked there, but the restriction enzyme script produced the same error.

From cjfields at uiuc.edu Tue Jun 12 13:51:47 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 12 Jun 2007 12:51:47 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To:
References:
Message-ID:

This is an instance where 'use strict' would have shown the problem right away. You left off your constructor call:

my $all_collection = Bio::Restriction::EnzymeCollection;

should be

my $all_collection = Bio::Restriction::EnzymeCollection->new;

chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From ryanx07 at hotmail.com Tue Jun 12 14:11:15 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Tue, 12 Jun 2007 13:11:15 -0500
Subject: [Bioperl-l] basic questions
Message-ID:

Thank you very much. That got the script a bit further, but now I get the following error:

C:\~Scripts>perl t9.pl
Can't locate object method "name" via package "Bio::Restriction::EnzymeCollection" at t9.pl line 5, line 532.

I checked the documentation; there is no "name" method for that package. Thanks.

From cjfields at uiuc.edu Tue Jun 12 14:35:15 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 12 Jun 2007 13:35:15 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To:
References:
Message-ID: <287E93E2-1902-4796-971E-B1DCA805D032@uiuc.edu>

Bio::Restriction::EnzymeCollection holds Bio::Restriction::Enzyme objects, each with its own name(). Using grouped methods like '$collection->cutters(6)' will retrieve a new EnzymeCollection containing all six-cutters from the original collection. You should either use one of the EnzymeCollection accessor methods to retrieve the enzyme you want, or iterate through them all. This works for me:

use Bio::Restriction::EnzymeCollection;
my $all_collection = Bio::Restriction::EnzymeCollection->new();
my $six_cutter_collection = $all_collection->cutters(6);
for my $enz ($six_cutter_collection->each_enzyme){
    print $enz->name,"\t",$enz->site,"\t",$enz->overhang_seq,"\n";
}

chris

From johnsonm at gmail.com Tue Jun 12 15:07:57 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Tue, 12 Jun 2007 14:07:57 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split)
In-Reply-To:
References:
Message-ID:

I'll wait a day, and if there is no opinion to the contrary, implement it this way.

On 6/12/07, Mark Johnson wrote:
> That's a good point.
Both Bio::Tools::Glimmer and
> Bio::Tools::Genemark produce Bio::SeqFeature::Gene objects, each with
> a single Bio::SeqFeature::Gene::Exon, when parsing predictions for
> prokaryotic sequence (multiple exons for eukaryotic). There are
> eukaryotic and prokaryotic versions of both predictor families. Maybe
> the most elegant solution would be to simply modify both modules to
> only emit Bio::SeqFeature::Generic features when operating on
> prokaryotic mode output? Fix the data model and the problem goes
> away. 8)

From torsten.seemann at infotech.monash.edu.au Tue Jun 12 20:18:27 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Wed, 13 Jun 2007 10:18:27 +1000
Subject: [Bioperl-l] gff2xml
In-Reply-To: <462784640706120904g25a6550dsc56a22af64ca98cd@mail.gmail.com>
References: <462784640706120904g25a6550dsc56a22af64ca98cd@mail.gmail.com>
Message-ID:

Sean

> I posted this on the gbrowse list earlier. I'm looking to convert gff
> data files into xml. Does anyone know of a module written to do this
> already?

What DTD do you want the XML to conform to? e.g. ChadoXML, TinySeq XML, TIGR XML ... ?
What program are you trying to get to load the XML?

BioPerl has some Bio::SeqIO::xxxxx modules for some XML formats that you could use. There is also a script that comes with BioPerl, "bp_seqconvert.pl -h", which may be useful.

--
--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University
--Tel +61 3 9905 9010

From hlapp at gmx.net Tue Jun 12 20:55:57 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 12 Jun 2007 20:55:57 -0400
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split)
In-Reply-To:
References:
Message-ID: <0915FAB4-E554-4E65-BA3F-1B916F0F95FC@gmx.net>

I think it was just trying to guard against people trying to do stupid things.

I'm actually not sure that representing locations on a circular genome using split locations really is the best thing. I'm wondering whether one shouldn't rather introduce a CircularLocation object (though obviously it isn't the location that's circular...). Just a thought.

In the end, if you have a way to make this work that you feel comfortable with, then go for it.

-hilmar

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From hlapp at gmx.net Tue Jun 12 20:57:06 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 12 Jun 2007 20:57:06 -0400
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split)
In-Reply-To:
References:
Message-ID: <80EAA2F1-B2DA-45F0-B591-8534C356E679@gmx.net>

I like that.
Don't force a model to do what you want if it doesn't really apply anyway.

-hilmar

From cjfields at uiuc.edu Tue Jun 12 21:20:41 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 12 Jun 2007 20:20:41 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split)
In-Reply-To: <80EAA2F1-B2DA-45F0-B591-8534C356E679@gmx.net>
References: <80EAA2F1-B2DA-45F0-B591-8534C356E679@gmx.net>
Message-ID: <951EB9CA-2066-4CD1-BCD5-4E00232CA507@uiuc.edu>

It will be interesting to see if bioperl handles wrap-around split locations via spliced_seq() and other methods. I can't see why it wouldn't, but one never knows. Might be something to add to location tests at some point...

chris

From ryanx07 at hotmail.com Wed Jun 13 08:16:15 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Wed, 13 Jun 2007 07:16:15 -0500
Subject: [Bioperl-l] Example code in Bioperl Tutorial
Message-ID:

Thanks so much, Chris, it works now. All the code I tested was copied from the Bioperl Tutorial. Why does it have such problems: is it a platform issue, or a difference between BioPerl versions? So far I have tested 6 scripts; three work and three don't. Here is the problem with the third failing script:

=================================
use strict;
use Bio::Tools::Run::RemoteBlast;
my $remote_blast = Bio::Tools::Run::RemoteBlast->new(
    -prog   => 'blastn',
    -data   => 'ecoli',
    -expect => '1e-10'
);
my $r = $remote_blast->submit_blast("d1.fa");
my $rc;
while ( my @rids = $remote_blast->each_rid ) {
    for my $rid ( @rids ) {
        $rc = $remote_blast->retrieve_blast($rid);
    }
}
print "$rc\n"; # I just want to print something here before parsing the result
=========================================================d1.fa
>example
CCCTTCAGGTACCCCGAGGTAACACGAGACACTCGGGATCTGGGAAGGGGACTGGGGCTTCTTTAAAAGCGCTCAGTTTAAAAAGCTTCTATGCCTGAATAGGTGACCGGAGGCCGGCACC
=========================================================result
C:\>perl t13.pl

-------------------- WARNING ---------------------
MSG: An Error Occurred

500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: An Error Occurred

500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
---------------------------------------------------
Terminating on signal SIGINT(2)

C:\>

Please help me to correct the problem, thanks.

From cjfields at uiuc.edu Wed Jun 13 10:41:55 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 09:41:55 -0500
Subject: [Bioperl-l] Example code in Bioperl Tutorial
In-Reply-To:
References:
Message-ID: <4F7BE556-BD8C-4378-BDE7-1F31364F49DA@uiuc.edu>

Judging by the output, it looks like you have no network access or can't connect to the server (which RemoteBlast needs). Make sure you don't need proxy settings.

To preempt the next question: no, I'm not going to explain what a proxy is. The RemoteBlast docs show how to set them, and Google is a wonderful tool...

chris

On Jun 13, 2007, at 7:16 AM, L Xu wrote:

> ...
> -------------------- WARNING ---------------------
> MSG:
> An Error Occurred
>

> 500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
>
> ---------------------------------------------------
> ...

From ryanx07 at hotmail.com Wed Jun 13 11:01:07 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Wed, 13 Jun 2007 10:01:07 -0500
Subject: [Bioperl-l] Example code in Bioperl Tutorial
Message-ID:

I do have an internet connection but do not use a proxy server. I tested the network connection with the ping command (below). The NCBI website does not respond. Is any special network setting needed to connect to the NCBI website? Thank you so much.

C:\>ping www.yahoo.com

Pinging www.yahoo-ht3.akadns.net [69.147.114.210] with 32 bytes of data:

Reply from 69.147.114.210: bytes=32 time=363ms TTL=45
Reply from 69.147.114.210: bytes=32 time=319ms TTL=45
Reply from 69.147.114.210: bytes=32 time=312ms TTL=45
Reply from 69.147.114.210: bytes=32 time=360ms TTL=45

Ping statistics for 69.147.114.210:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 312ms, Maximum = 363ms, Average = 338ms

C:\>ping www.ncbi.nlm.nih.gov

Pinging www.ncbi.nlm.nih.gov [130.14.29.110] with 32 bytes of data:

Request timed out.
Request timed out.
Request timed out.
Request timed out.

Ping statistics for 130.14.29.110:
    Packets: Sent = 4, Received = 0, Lost = 4 (100% loss),

From cjfields at uiuc.edu Wed Jun 13 12:14:22 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 11:14:22 -0500
Subject: [Bioperl-l] method naming
Message-ID: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>

Some quick questions on method naming. I couldn't find this on the mail list previously and just want some opinions.

1) Is there any preference on how to name a method that returns a list of class instances vs. data? I have seen 'each' (each_Location, each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs. simple (hits, hsps).

2) Do we want methods which return objects to have the object name in Title Case (each_Location, get_Seq_by_id, etc) or does it really matter?

chris

From dmessina at wustl.edu Wed Jun 13 12:41:53 2007
From: dmessina at wustl.edu (David Messina)
Date: Wed, 13 Jun 2007 11:41:53 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
Message-ID: <9A51046D-4827-4CF7-A2B7-7880E03129E9@wustl.edu>

> 1) Is there any preference on how to name a method that returns a
> list of class instances vs. data? I have seen 'each' (each_Location,
> each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs.
> simple (hits, hsps).

I'd prefer 'get_all' because it's more intuitive to me what the method is doing. 'Each' is too programmer-y.

> 2) Do we want methods which return objects to have the object name
> in Title Case (each_Location, get_Seq_by_id, etc) or does it really
> matter?

I like Title Case because it reinforces the notion that what you're getting back is a specific object with that name (Seq) rather than the generic thing that the name represents (AGTCTGTGATAT, the actual sequence as a string).

Dave

From hlapp at gmx.net Wed Jun 13 13:03:59 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 13 Jun 2007 13:03:59 -0400
Subject: [Bioperl-l] method naming
In-Reply-To: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
Message-ID: <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>

We set a convention a while back on how to name these. It is implemented in the bioperl.lisp file (too bad no one is using emacs any more these days - it's a great editor), and in fact we started a renaming campaign (not sure when that was) on the SeqI and SeqFeatureI classes (you'll still see the old names aliased).

However, we never got to finish the clean up.

The convention was to use get_{ClassName}s, and get_all_{ClassName}s if there is a difference from the former (mostly because of hierarchical data; for example features can be nested, and get_all_SeqFeatures returns them all flattened out, while get_SeqFeatures returns only the top objects), and for modifying, add_{ClassName} and remove_{ClassName}s.

The class name was to be in title case to emphasize the fact that it is an array of objects you'd be getting back (and what kind of objects). If it is strings or any other scalar type, the name would be in lower case.

-hilmar

On Jun 13, 2007, at 12:14 PM, Chris Fields wrote:

> Some quick questions on method naming. I couldn't find this on the
> mail list previously and just want some opinions.
>
> 1) Is there any preference on how to name a method that returns a
> list of class instances vs. data? I have seen 'each' (each_Location,
> each_tag_value) vs.
'get_all' (get_all_tags, get_all_SeqFeatures) vs. > simple (hits, hsps). > > 2) Do we want have methods which return objects have the object name > in Title Case (each_Location, get_Seq_by_id, etc) or does it really > matter? > > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Wed Jun 13 13:19:43 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 13 Jun 2007 12:19:43 -0500 Subject: [Bioperl-l] method naming In-Reply-To: <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> Message-ID: Sounds good. I also agree with Dave on the use of 'each', as it's a bit ambiguous (seems to imply iteration as opposed to returning a whole list). We probably need to post this somewhere on the wiki for future reference; maybe in Advanced BioPerl? I'll add this in shortly. chris On Jun 13, 2007, at 12:03 PM, Hilmar Lapp wrote: > We set a convention a while back on how to name these. It is > implemented in the bioperl.lisp file (too bad no one is using emacs > any more these days - it's a great editor), and in fact we started > a renaming campaign (not sure when that was) on the SeqI and > SeqFeatureI classes (you'll still see the old names aliased). > > However, we never got to finish the clean up. > > The convention was to use get_{ClassName}s, and get_all_{ClassName} > s if there is a difference to the former (mostly because of > hierarchical data; for example features can be nested, and > get_all_SeqFeatures returns them all flattened out, while > get_SeqFeatures returns only the top objects), and for modifying > add_{ClassName} and remove_{ClassName}s.
> > The class name was to be in title case to emphasize the fact that > it is an array of object you'd be getting back (and what kind of > objects). If it is strings or any other scalar type, the name would > be in lower case. > > -hilmar > > On Jun 13, 2007, at 12:14 PM, Chris Fields wrote: > >> Some quick questions on method naming. I couldn't find this on the >> mail list previously and just want some opinions. >> >> 1) Is there any preference on how to name a method that returns a >> list of class instances vs. data? I have seen 'each' (each_Location, >> each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs. >> simple (hits, hsps). >> >> 2) Do we want have methods which return objects have the object name >> in Title Case (each_Location, get_Seq_by_id, etc) or does it really >> matter? >> >> chris >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Wed Jun 13 14:43:41 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 13 Jun 2007 13:43:41 -0500 Subject: [Bioperl-l] method naming In-Reply-To: <467036FC.8000505@watson.wustl.edu> References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <9A51046D-4827-4CF7-A2B7-7880E03129E9@wustl.edu> <467036FC.8000505@watson.wustl.edu> Message-ID: <286EE81C-0926-4AAE-9110-02948DFADF36@uiuc.edu> On Jun 13, 2007, at 1:27 PM, Michael Kiwala wrote: > > David Messina wrote: >>> 1) Is there any preference on how to name a method that returns a >>> list of class instances vs. data? I have seen >>> 'each' (each_Location, >>> each_tag_value) vs. 
'get_all' (get_all_tags, get_all_SeqFeatures) >>> vs. >>> simple (hits, hsps). >>> >> >> I'd prefer 'get_all' because it's more intuitive to me what the >> method is doing. 'Each' is too programmer-y. >> >> >> > When I think 'get_all', I think of a method that returns a list of > objects at once. When I think of 'each', I think of a method that > returns a scalar but can be called multiple times to iterate over a > set of objects. Yep, hence the ambiguity issue (and my confusion). I think it was so you could both iterate and return a list using this: for my $obj ($seq->each_Class) {...} my @objs = $seq->each_Class; I use 'next' and 'get/get_all' as an iterator and get accessor (similar to how it's used in Bio::SearchIO): while (my $obj = $seq->next_Class) {...} my @objs = $seq->get_Class; # or get_all_Class for flattened lists which to me is much clearer. chris From mkiwala at watson.wustl.edu Wed Jun 13 14:27:08 2007 From: mkiwala at watson.wustl.edu (Michael Kiwala) Date: Wed, 13 Jun 2007 13:27:08 -0500 Subject: [Bioperl-l] method naming In-Reply-To: <9A51046D-4827-4CF7-A2B7-7880E03129E9@wustl.edu> References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <9A51046D-4827-4CF7-A2B7-7880E03129E9@wustl.edu> Message-ID: <467036FC.8000505@watson.wustl.edu> David Messina wrote: >> 1) Is there any preference on how to name a method that returns a >> list of class instances vs. data? I have seen 'each' (each_Location, >> each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs. >> simple (hits, hsps). >> > > I'd prefer 'get_all' because it's more intuitive to me what the > method is doing. 'Each' is too programmer-y. > > > When I think 'get_all', I think of a method that returns a list of objects at once. When I think of 'each', I think of a method that returns a scalar but can be called multiple times to iterate over a set of objects. 
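To make the distinction discussed in this thread concrete, here is a minimal Perl sketch of the accessor styles: get_{ClassName}s for the top-level list, get_all_{ClassName}s for the flattened list, and next_{ClassName} as an iterator. My::Feature and all of its method names are hypothetical stand-ins chosen for illustration; they are not actual BioPerl classes or methods, and real modules may behave differently.

```perl
use strict;
use warnings;

# Hypothetical toy class for illustration only; not a BioPerl module.
package My::Feature;

sub new {
    my ($class, %args) = @_;
    my $self = {
        name     => $args{-name},
        features => [],    # directly nested sub-features
        cursor   => 0,     # position used by next_Feature
    };
    return bless $self, $class;
}

sub name { return $_[0]->{name} }

# add_{ClassName}: attach one nested feature.
sub add_Feature {
    my ($self, $feature) = @_;
    push @{ $self->{features} }, $feature;
    return $self;
}

# get_{ClassName}s: return only the top-level nested features as a list.
sub get_Features {
    my ($self) = @_;
    return @{ $self->{features} };
}

# get_all_{ClassName}s: return nested features at every depth, flattened.
sub get_all_Features {
    my ($self) = @_;
    my @all;
    for my $feature ($self->get_Features) {
        push @all, $feature, $feature->get_all_Features;
    }
    return @all;
}

# next_{ClassName}: iterator; one object per call, undef (and a reset) at the end.
sub next_Feature {
    my ($self) = @_;
    if ($self->{cursor} >= @{ $self->{features} }) {
        $self->{cursor} = 0;
        return;
    }
    return $self->{features}[ $self->{cursor}++ ];
}

package main;

my $gene = My::Feature->new(-name => 'gene');
my $mrna = My::Feature->new(-name => 'mRNA');
my $exon = My::Feature->new(-name => 'exon');
$mrna->add_Feature($exon);
$gene->add_Feature($mrna);

my @top = $gene->get_Features;        # top level only
my @all = $gene->get_all_Features;    # every depth, flattened

print "top: ", join(', ', map { $_->name } @top), "\n";
print "all: ", join(', ', map { $_->name } @all), "\n";

# Iterator style: one object per call until the list is exhausted.
while (my $feature = $gene->next_Feature) {
    print "next: ", $feature->name, "\n";
}
```

With names like these, the Title Case part says what kind of object comes back, while the get_/get_all_/next_ prefix says whether the caller receives a list or a single item per call.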
From sac at bioperl.org Wed Jun 13 17:17:27 2007 From: sac at bioperl.org (Steve Chervitz) Date: Wed, 13 Jun 2007 14:17:27 -0700 Subject: [Bioperl-l] method naming In-Reply-To: <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> Message-ID: <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com> On 6/13/07, Hilmar Lapp wrote: > We set a convention a while back on how to name these. It is > implemented in the bioperl.lisp file (too bad no one is using emacs > any more these days - it's a great editor), As a staunch xemacs-ophile I couldn't let that one slip by. Maybe we could improve the visibility of bioperl.lisp. In truth, I had forgotten about it, though it turns out I was loading an old version of it. (Btw, using the latest version of bioperl.lisp with xemacs 21.4.17, I don't get a bioperl menu item, though I can access bioperl functions via M-x. Suggestions?) I see bioperl.lisp is mentioned twice parenthetically in the advanced bioperl wiki page. Perhaps a separate 'Editor/IDE support' item here would help. While we're at it, maybe we could add a bioperl.vi file to the distribution (if you can do such things with vi/vim). On 6/13/07, Chris Fields wrote: > We probably need to post this somewhere on the wiki for future > reference; maybe in Advanced BioPerl? I'll add this in shortly. Another idea: Add a method naming check to the set of audits we perform on CVS committed code. It could check for agreement with our conventions and warn if nothing was found (may not be a problem though).
Steve From arareko at campus.iztacala.unam.mx Wed Jun 13 18:03:34 2007 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Wed, 13 Jun 2007 17:03:34 -0500 Subject: [Bioperl-l] method naming In-Reply-To: <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com> References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com> Message-ID: <467069B6.7080003@campus.iztacala.unam.mx> By the time of the 1.5.2 release, I jumped onto the idea of creating a BioPerl template for Komodo. Chris F handed me one he had already made but in the end I didn't had enough spare time to get into it. If someone wants to give it a try please let ChrisF/me know. Regards, Mauricio. Steve Chervitz wrote: > On 6/13/07, Hilmar Lapp wrote: >> We set a convention a while back on how to name these. It is >> implemented in the bioperl.lisp file (too bad no one is using emacs >> any more these days - it's a great editor), > > As a staunch xemacs-ophile I couldn't let that one slip by. Maybe we > could improve the visibility of bioperl.lisp. In truth, I had > forgotten about it, though lit turns out I was loading an old version > of it. (Btw, using the latest version of bioperl.lisp with xemacs > 21.4.17, I don't get a bioperl menu item, though I can access bioperl > functions via M-x. Suggestions?) > > I see bioperl.lisp is mentioned twice parenthetically in the advanced > bioperl wiki page. Perhaps a separate 'Editor/IDE support' item here > would help. While we're at it, maybe we could add a bioperl.vi file to > the distribution (if you can do such things with vi/vim). > > On 6/13/07, Chris Fields wrote: >> We probably need to post this somewhere on the wiki for future >> reference; maybe in Advanced BioPerl? I'll add this in shortly. > > Another idea: Add a method naming check to the set of audits we > perform on CVS committed code. 
It could check for agreement with our > conventions and warn if nothing was found (may not be a problem > though). > > Steve > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Genética Unidad de Morfofisiología y Función Facultad de Estudios Superiores Iztacala, UNAM From hlapp at gmx.net Wed Jun 13 18:41:45 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 13 Jun 2007 18:41:45 -0400 Subject: [Bioperl-l] method naming In-Reply-To: <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com> References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com> Message-ID: On Jun 13, 2007, at 5:17 PM, Steve Chervitz wrote: > using the latest version of bioperl.lisp with xemacs 21.4.17, I > don't get a bioperl menu item I'm using GNU Emacs 22.0.50.1 (as Aquamacs) and the BioPerl menu item is showing up just beautifully. (BTW it also has very nice icons for various functions - though I always feel guilty for using keystrokes instead.) Is GNU Emacs finally winning this? ;) -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From jason at bioperl.org Wed Jun 13 18:58:51 2007 From: jason at bioperl.org (Jason Stajich) Date: Wed, 13 Jun 2007 15:58:51 -0700 Subject: [Bioperl-l] method naming In-Reply-To: References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com> Message-ID: <4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org> Post your dueling screenshots to the wiki!
I had started a couple of IDE pages on the wiki a while ago: http://bioperl.org/wiki/Emacs http://bioperl.org/wiki/Emacs_template http://bioperl.org/wiki/Vi If anyone is feeling excited enough to write a few more IDE pages and link them into a common article that would be great. -jason On Jun 13, 2007, at 3:41 PM, Hilmar Lapp wrote: > > On Jun 13, 2007, at 5:17 PM, Steve Chervitz wrote: > >> using the latest version of bioperl.lisp with xemacs 21.4.17, I >> don't get a bioperl menu item > > I'm using GNU Emacs 22.0.50.1 (as Aquamacs) and the BioPerl menu item > it showing up just beautifully. (BTW it also have very nice icons for > various functions - though I always feel guilty for using keystrokes > instead.) > > Is GNU Emacs finally winning this? ;) > > -hilmar > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From cjfields at uiuc.edu Wed Jun 13 19:08:17 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 13 Jun 2007 18:08:17 -0500 Subject: [Bioperl-l] method naming In-Reply-To: <4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org> References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com> <4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org> Message-ID: Would probably be worth writing one up for Komodo since Mauricio, Sendu, and I use it. I updated the Advanced BioPerl page with Hilmar's methods suggestions/ rules (as well as a few I found dating back a number of years on the mail list). 
It might be worth a glance in case there are any changes needed: http://www.bioperl.org/wiki/Advanced_BioPerl chris On Jun 13, 2007, at 5:58 PM, Jason Stajich wrote: > Post your dualing screenshots to the wiki! > > I had started a couple of IDE pages on the wiki a while ago: > http://bioperl.org/wiki/Emacs > http://bioperl.org/wiki/Emacs_template > http://bioperl.org/wiki/Vi > > If anyone is feeling excited enough to write a few more IDE pages > and link them into a common article that would be great. > > -jason > On Jun 13, 2007, at 3:41 PM, Hilmar Lapp wrote: > >> >> On Jun 13, 2007, at 5:17 PM, Steve Chervitz wrote: >> >>> using the latest version of bioperl.lisp with xemacs 21.4.17, I >>> don't get a bioperl menu item >> >> I'm using GNU Emacs 22.0.50.1 (as Aquamacs) and the BioPerl menu item >> it showing up just beautifully. (BTW it also have very nice icons for >> various functions - though I always feel guilty for using keystrokes >> instead.) >> >> Is GNU Emacs finally winning this? ;) >> >> -hilmar >> -- >> =========================================================== >> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >> =========================================================== >> >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > Jason Stajich > jason at bioperl.org > http://jason.open-bio.org/ > > Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Wed Jun 13 19:28:17 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 13 Jun 2007 19:28:17 -0400 Subject: [Bioperl-l] method naming In-Reply-To: References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com> <4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org> Message-ID: <06AE29E7-6FFA-4F92-8BC8-39D9E48549E4@gmx.net> Thanks Chris for doing this - looks great. The only comment that I have is that method names should never start with a capital letter. If the getter/setter is for a single object (as opposed to a list), the name should probably be similar (if not identical) to the class being expected and returned, but lower-case. E.g., $feature->location(), $seq->species() etc -hilmar On Jun 13, 2007, at 7:08 PM, Chris Fields wrote: > Would probably be worth writing one up for Komodo since Mauricio, > Sendu, and I use it. > > I updated the Advanced BioPerl page with Hilmar's methods > suggestions/rules (as well as a few I found dating back a number of > years on the mail list). It might be worth a glance in case there > are any changes needed: > > http://www.bioperl.org/wiki/Advanced_BioPerl > > chris > > On Jun 13, 2007, at 5:58 PM, Jason Stajich wrote: > >> Post your dualing screenshots to the wiki! >> >> I had started a couple of IDE pages on the wiki a while ago: >> http://bioperl.org/wiki/Emacs >> http://bioperl.org/wiki/Emacs_template >> http://bioperl.org/wiki/Vi >> >> If anyone is feeling excited enough to write a few more IDE pages >> and link them into a common article that would be great. 
>> >> -jason >> On Jun 13, 2007, at 3:41 PM, Hilmar Lapp wrote: >> >>> >>> On Jun 13, 2007, at 5:17 PM, Steve Chervitz wrote: >>> >>>> using the latest version of bioperl.lisp with xemacs 21.4.17, I >>>> don't get a bioperl menu item >>> >>> I'm using GNU Emacs 22.0.50.1 (as Aquamacs) and the BioPerl menu >>> item >>> it showing up just beautifully. (BTW it also have very nice icons >>> for >>> various functions - though I always feel guilty for using keystrokes >>> instead.) >>> >>> Is GNU Emacs finally winning this? ;) >>> >>> -hilmar >>> -- >>> =========================================================== >>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >>> =========================================================== >>> >>> >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> -- >> Jason Stajich >> jason at bioperl.org >> http://jason.open-bio.org/ >> >> > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Wed Jun 13 19:44:08 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 13 Jun 2007 18:44:08 -0500 Subject: [Bioperl-l] method naming In-Reply-To: <06AE29E7-6FFA-4F92-8BC8-39D9E48549E4@gmx.net> References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com> <4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org> <06AE29E7-6FFA-4F92-8BC8-39D9E48549E4@gmx.net> Message-ID: <91AF2018-EC27-49FD-A4D1-C31C0E73DEFB@uiuc.edu> Agreed. We can definitely add that in. 
As we edge towards another release we try another round of cleaning up. I wouldn't mind pushing out another 1.5 point release before summer's up if possible; most of the tough work was done for v.1.5.2 by Sendu. chris On Jun 13, 2007, at 6:28 PM, Hilmar Lapp wrote: > Thanks Chris for doing this - looks great. The only comment that I > have is that method names should never start with a capital letter. > If the getter/setter is for a single object (as opposed to a list), > the name should probably be similar (if not identical) to the class > being expected and returned, but lower-case. > > E.g., $feature->location(), $seq->species() etc > > -hilmar > > On Jun 13, 2007, at 7:08 PM, Chris Fields wrote: > >> Would probably be worth writing one up for Komodo since Mauricio, >> Sendu, and I use it. >> >> I updated the Advanced BioPerl page with Hilmar's methods >> suggestions/rules (as well as a few I found dating back a number of >> years on the mail list). It might be worth a glance in case there >> are any changes needed: >> >> http://www.bioperl.org/wiki/Advanced_BioPerl >> >> chris ... From johncumbers at gmail.com Wed Jun 13 20:20:42 2007 From: johncumbers at gmail.com (John Cumbers) Date: Wed, 13 Jun 2007 20:20:42 -0400 Subject: [Bioperl-l] How can I pull out all instances of a motif from a genome sequence and output them as a BED file? Message-ID: Hello, I have a simple problem, I'm trying to search a genome sequence for a motif, I then want to output a BED file to display all the locations of this motif on the UCSC Genome Browser. I could not find a script to do this, so I started to write my own. I'm new to perl and my code below was my attempt to read the sequence string and output the index bp of the start of each motif. With this I could build the BED file myself, which requires start and finish base pairs. For the first motif I can output the start index, but when I try and read the next one off the sequence it does not work. 
Instead I just get an output of a list of 1's. I realise that this is more a request for some simple perl help, but any help much appreciated. Best wishes, John $seq_object = read_sequence("Drosophila.Chr3.test.AE014296.fasta"); #turn my FASTA file into a seq object. $sequence_as_a_string = $seq_object->seq(); #turn it into a string # search $sequence_as_a_string string for motif AAA as example # if found, return the index that it is found at while ($sequence_as_a_string =~ m/AAA/g) { print "Found '$&'. Next attempt at character " . pos($sequence_as_a_string)+1 . "\n"; } -- John Cumbers, Graduate Student Biology and Medicine Brown University, Box G-W Providence, Rhode Island, 02912, USA Tel USA: +1 401 523 8190, Fax: +1 401 863-2166 UK to USA: 0207 617 7824 From cjfields at uiuc.edu Wed Jun 13 21:58:37 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 13 Jun 2007 20:58:37 -0500 Subject: [Bioperl-l] How can I pull out all instances of a motif from a genome sequence and output them as a BED file? In-Reply-To: References: Message-ID: <5F3F49B4-CE07-4DD7-B82E-9DDE42B516D0@uiuc.edu> This is answered in the FAQ (sorry if the URL wraps, but we don't like tinyurls): http://www.bioperl.org/wiki/ FAQ#How_do_I_do_motif_searches_with_BioPerl.3F_Can_I_do_. 22find_all_sequences_that_are_75.25_identical.22_to_a_given_motif.3F chris On Jun 13, 2007, at 7:20 PM, John Cumbers wrote: > Hello, > > I have a simple problem, I'm trying to search a genome sequence for > a motif, > I then want to output a BED file to display all the locations of > this motif > on the UCSC Genome Browser. I could not find a script to do this, > so I > started to write my own. I'm new to perl and my code below was my > attempt > to read the sequence string and output the index bp of the start of > each > motif. With this I could build the BED file myself, which requires > start > and finish base pairs. 
> > For the first motif I can output the start index, but when I try > and read > the next one off the sequence it does not work. Instead I just get an > output of a list of 1's. I realise that this is more a request for > some > simple perl help, but any help much appreciated. > > Best wishes, > John > > > $seq_object = read_sequence > ("Drosophila.Chr3.test.AE014296.fasta"); #turn > my FASTA file into a seq object. > $sequence_as_a_string = $seq_object->seq(); #turn it into a string > # search $sequence_as_a_string string for motif AAA as example > # if found, return the index that it is found at > > while ($sequence_as_a_string =~ m/AAA/g) { > print "Found '$&'. Next attempt at character " . > pos($sequence_as_a_string)+1 . "\n"; > } > > > > -- > John Cumbers, Graduate Student > Biology and Medicine > Brown University, Box G-W > Providence, Rhode Island, 02912, USA > Tel USA: +1 401 523 8190, Fax: +1 401 863-2166 > UK to USA: 0207 617 7824 > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From jason at bioperl.org Thu Jun 14 00:08:04 2007 From: jason at bioperl.org (Jason Stajich) Date: Wed, 13 Jun 2007 21:08:04 -0700 Subject: [Bioperl-l] wiki bulk update Message-ID: <992B2C7A-E944-4C69-BDE0-B0B0F6D1274D@bioperl.org> I did a some bulk update of Module pages for new modules that had been created since we last setup these pages: I outlined a little bit of what it requires behind the scenes. 
http://bioperl.org/wiki/BioPerl:Module_pages -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From bix at sendu.me.uk Thu Jun 14 05:35:00 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 14 Jun 2007 10:35:00 +0100 Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new() Message-ID: <46710BC4.3060302@sendu.me.uk> It is preferable to have ->new syntax over new Object syntax, as outlined here: http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules I propose making this syntax change in all Bioperl POD documentation, so that the bad syntax is no longer suggested/encouraged. Any objections? If not, I'll go ahead and commit the changes. (affects 907 modules in live) Cheers, Sendu. From bix at sendu.me.uk Thu Jun 14 06:01:02 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 14 Jun 2007 11:01:02 +0100 Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new() In-Reply-To: <46710BC4.3060302@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> Message-ID: <467111DE.6060800@sendu.me.uk> Sendu Bala wrote: > It is preferable to have ->new syntax over new Object syntax, as > outlined here: > http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules > > > I propose making this syntax change in all Bioperl POD documentation, Actually, I propose making the change to code as well. From hlapp at gmx.net Thu Jun 14 08:47:47 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 14 Jun 2007 08:47:47 -0400 Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new() In-Reply-To: <467111DE.6060800@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <467111DE.6060800@sendu.me.uk> Message-ID: <0D7CD74F-DCB3-44F8-9AC7-144B1BD58946@gmx.net> Sounds fine to me. People do go by working examples, and I've seen inconsistent examples leading to confusion on the end of newbies. 
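As a minimal sketch of the two constructor styles under discussion (My::Seq is a hypothetical stand-in, not a BioPerl class), both calls below build the same kind of object on perls where indirect object syntax is enabled, which is the default unless it is switched off (for example by `no feature 'indirect'` or `use v5.36`). The difference is that the arrow form names the class and the method explicitly, while the indirect form leaves the parser to guess that "new" is a method call.

```perl
use strict;
use warnings;

# Hypothetical stand-in class for illustration; not a real BioPerl module.
package My::Seq;

sub new {
    my ($class, %args) = @_;
    my $self = { id => $args{-id} };
    return bless $self, $class;
}

sub id { return $_[0]->{id} }

package main;

# Indirect object syntax: the parser must work out that "new" is a
# method to be called on the class My::Seq.
my $old_style = new My::Seq(-id => 'seq1');

# Arrow syntax: class and method are stated explicitly.
my $new_style = My::Seq->new(-id => 'seq1');

print ref($old_style), " ", $old_style->id, "\n";
print ref($new_style), " ", $new_style->id, "\n";
```

Since documentation examples tend to be copied as-is, showing only the arrow form avoids passing the ambiguous form on to new code.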
-hilmar On Jun 14, 2007, at 6:01 AM, Sendu Bala wrote: > Sendu Bala wrote: >> It is preferable to have ->new syntax over new Object syntax, as >> outlined here: >> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object- >> oriented_programming_and_modules >> >> >> I propose making this syntax change in all Bioperl POD documentation, > > Actually, I propose making the change to code as well. > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Thu Jun 14 08:55:18 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 14 Jun 2007 07:55:18 -0500 Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new() In-Reply-To: <467111DE.6060800@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <467111DE.6060800@sendu.me.uk> Message-ID: Sounds fine by me. I may actually start tackling some of the feature/ annotation overloading stuff myself to see what happens (I'll drop a notice when that occurs). chris On Jun 14, 2007, at 5:01 AM, Sendu Bala wrote: > Sendu Bala wrote: >> It is preferable to have ->new syntax over new Object syntax, as >> outlined here: >> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object- >> oriented_programming_and_modules >> >> >> I propose making this syntax change in all Bioperl POD documentation, > > Actually, I propose making the change to code as well. From tanzeem.mb at gmail.com Thu Jun 14 02:27:19 2007 From: tanzeem.mb at gmail.com (tanzeem) Date: Wed, 13 Jun 2007 23:27:19 -0700 (PDT) Subject: [Bioperl-l] Problem working with remoteblast submit method in webbrowser. 
Message-ID: <11114623.post@talk.nabble.com> I have a program which uses the BioPerl RemoteBlast module to compare an amino acid FASTA file against the Swiss-Prot database. The submit_blast() method works successfully when run from the command line, but when the program is run from a web browser it returns -1. I was trying to adapt the code from the RemoteBlast synopsis for my needs. From bix at sendu.me.uk Thu Jun 14 11:34:27 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 14 Jun 2007 16:34:27 +0100 Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new() In-Reply-To: <46710BC4.3060302@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> Message-ID: <46716003.2030302@sendu.me.uk> Sendu Bala wrote: > It is preferable to have ->new syntax over new Object syntax, as > outlined here: > http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules > > I propose making this syntax change in all Bioperl POD documentation, so > that the bad syntax is no longer suggested/encouraged. Any objections? > If not, I'll go ahead and commit the changes. > > (affects 907 modules in live) It was actually 515 modules & test scripts from live, 48 from run, 21 from db and 2 from network. Now committed. Before and after my changes these were failing:

Failed Test      Stat  Wstat  Total  Fail  List of Failed
-------------------------------------------------------------------------------
t/BioGraphics.t     3    768     38     3  3-5
t/PodSyntax.t       9   2304   2195     9  378 614 660 1023 1197 1512 1558 1932 2106
t/Sopma.t           2    512     16     2  8 15
t/genbank.t         2    512    247     2  122-123

BioGraphics failure caused by Graphics Panel.pm 1.135 -> 1.136 (unintentional?). Sopma may not be a bug: results from server might have changed.
genbank caused by genbank.t 1.20 -> 1.21 and presumably genbank.pm 1.163 -> 1.164 not doing what the new tests expect. PodSyntax is new and obviously a bunch of POD needs fixing: Nathan, are you working on that, or can I fix those errors? Anyone care to look into those things? Cheers, Sendu. From cjfields at uiuc.edu Thu Jun 14 12:35:21 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 14 Jun 2007 11:35:21 -0500 Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new() In-Reply-To: <46716003.2030302@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> Message-ID: The genbank commit was mine so I'll look into it; may be that I hadn't finished up the bug work. If I have time I'll look into Sopma as well (unless you get to it first). chris On Jun 14, 2007, at 10:34 AM, Sendu Bala wrote: > Sendu Bala wrote: >> It is preferable to have ->new syntax over new Object syntax, as >> outlined here: >> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object- >> oriented_programming_and_modules >> >> I propose making this syntax change in all Bioperl POD >> documentation, so >> that the bad syntax is no longer suggested/encouraged. Any >> objections? >> If not, I'll go ahead and commit the changes. >> >> (affects 907 modules in live) > > It was actually 515 modules & test scripts from live, 48 from run, 21 > from db and 2 from network. > > Now committed. Before and after my changes these were failing: > > > Failed Test Stat Wstat Total Fail List of Failed > ---------------------------------------------------------------------- > --------- > t/BioGraphics.t 3 768 38 3 3-5 > t/PodSyntax.t 9 2304 2195 9 378 614 660 1023 1197 1512 1558 > 1932 2106 > t/Sopma.t 2 512 16 2 8 15 > t/genbank.t 2 512 247 2 122-123 > > > BioGraphics failure caused by Graphics Panel.pm 1.135 -> 1.136 > (unintentional?). > > Sopma may not be a bug: results from server might have changed.
> > genbank caused by genbank.t 1.20 -> 1.21 and presumably genbank.pm > 1.163 > -> 1.164 not doing what the new tests expect. > > PodSyntax is new and obviously a bunch of POD needs fixing: Nathan, > are > you working on that, or can I fix those errors? > > Anyone care to look into those things? > > Cheers, > Sendu. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Thu Jun 14 12:43:43 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Thu, 14 Jun 2007 17:43:43 +0100 Subject: [Bioperl-l] Perltidy In-Reply-To: <46716003.2030302@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> Message-ID: <4671703F.4010109@sheffield.ac.uk> I'm just wondering if anyone passes their modules through perltidy in order for them to have the same look/feel? If so, do you have a .perltidyrc file? Also, is it worth running the Bioperl modules through it? Nath From n.haigh at sheffield.ac.uk Thu Jun 14 12:36:37 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Thu, 14 Jun 2007 17:36:37 +0100 Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new() In-Reply-To: <46716003.2030302@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> Message-ID: <46716E95.3090604@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Sendu Bala wrote: > Sendu Bala wrote: >> It is preferable to have ->new syntax over new Object syntax, as >> outlined here: >> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules >> >> I propose making this syntax change in all Bioperl POD documentation, so >> that the bad syntax is no longer suggested/encouraged. Any objections? 
>> If not, I'll go ahead and commit the changes. >> >> (affects 907 modules in live) > > It was actually 515 modules & test scripts from live, 48 from run, 21 > from db and 2 from network. > > Now committed. Before and after my changes these were failing: > > > Failed Test Stat Wstat Total Fail List of Failed > ------------------------------------------------------------------------------- > t/BioGraphics.t 3 768 38 3 3-5 > t/PodSyntax.t 9 2304 2195 9 378 614 660 1023 1197 1512 1558 > 1932 2106 > t/Sopma.t 2 512 16 2 8 15 > t/genbank.t 2 512 247 2 122-123 > > > BioGraphics failure caused by Graphics Panel.pm 1.135 -> 1.136 > (unintentional?). > > Sopma may not be a bug: results from server might have changed. > > genbank caused by genbank.t 1.20 -> 1.21 and presumably genbank.pm 1.163 > -> 1.164 not doing what the new tests expect. > > PodSyntax is new and obviously a bunch of POD needs fixing: Nathan, are > you working on that, or can I fix those errors? > I can fix these - although I'm still trying to get my new Debian 4.0 system up-to-speed so it might take me a little while! RE the PodSyntax.t, I've got it silently skipping the tests if Test::Pod isn't installed. However, would it be better to have Test::Pod in t/lib so that it runs on the user's system during installation or leave it as is? Nath -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGcW6VczuW2jkwy2gRAv3dAKCURgd4F881MhbessKxNh/cPrJu2wCeLwnS 7olroF2e6+4I0biz6fWRmu4= =s3hK -----END PGP SIGNATURE----- From bix at sendu.me.uk Thu Jun 14 13:15:24 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 14 Jun 2007 18:15:24 +0100 Subject: [Bioperl-l] Perltidy In-Reply-To: <4671703F.4010109@sheffield.ac.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> Message-ID: <467177AC.8060104@sendu.me.uk> Nathan S. 
Haigh wrote: > I'm just wondering if anyone passes their modules through perltidy in > order for them to have the same look/feel? If so, do you have a > .perltidyrc file? Also, is it worth running the Bioperl modules through it? I don't use it, but I was contemplating the same thing. Chris uses it from time to time and I think we have a similar taste in style. But we'd have to hammer something out that was agreeable to everyone. From mmokrejs at ribosome.natur.cuni.cz Thu Jun 14 13:19:42 2007 From: mmokrejs at ribosome.natur.cuni.cz (Martin MOKREJŠ) Date: Thu, 14 Jun 2007 19:19:42 +0200 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? In-Reply-To: <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> References: <466938F6.7050903@ribosome.natur.cuni.cz> <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> Message-ID: <467178AE.5040905@ribosome.natur.cuni.cz> David Messina wrote: > Hi Martin, > > You're in luck -- the BioPerl core distribution includes two scripts > for doing just that: > > genbank2gff Somehow these scripts were not installed for me on Gentoo, but I have them in the cvs copy. ;-) Anyway, the one above is not for me: I do not need the GFF database, or rather, I have no intention of installing that unfamiliar thing; it seems like overkill for my case. I just want to render a plasmid map. > genbank2gff3 This one seems more promising, but even with the current cvs checkout I get... $ perl /home/mmokrejs/proj/bioperl/bioperl-live/blib/script/bp_genbank2gff3.pl --in stdin --out stdout < ~/99.gb # Input: stdin Use of uninitialized value in split at /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/genbank.pm line 1335, line 7. Use of uninitialized value in quotemeta at /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/genbank.pm line 1338, line 7. Can't call method "binomial" on an undefined value at /home/mmokrejs/proj/bioperl/bioperl-live/blib/script/bp_genbank2gff3.pl line 675, line 125.
$ $ bp_seqconvert.pl --from genbank --to embl < ~/IRESite/gb/99.gb Use of uninitialized value in split at /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/genbank.pm line 1335, line 7. Use of uninitialized value in quotemeta at /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/genbank.pm line 1338, line 7. ID unknown; SV 1; circular; unassigned DNA; STD; UNC; 5391 BP. XX AC unknown; XX XX XX CC ApEinfo:methylated:0 ... Oh dear, I have just manually edited the files and still they are wrong? Oh no. :( > > Look in the scripts directory of the distro. > > Also, there is a *huge* amount of documentation and examples on the > BioPerl website. > > http://www.bioperl.org/wiki/HOWTOs You mean http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File ? ;-) > > Reading those, reading the FAQ, and searching the mailing list > archives are where I look first when I don't know how to do something > in BioPerl. > > > Dave > > -- > Dave Messina > Senior Analyst, Assembly Group > Genome Sequencing Center > Washington University > St. Louis, MO > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- Dr. Martin Mokrejs Dept. of Genetics and Microbiology Faculty of Science, Charles University Vinicna 5, 128 43 Prague, Czech Republic http://www.iresite.org http://www.iresite.org/~mmokrejs -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: 99.gb URL: From mmokrejs at ribosome.natur.cuni.cz Thu Jun 14 13:23:28 2007 From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=) Date: Thu, 14 Jun 2007 19:23:28 +0200 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? 
In-Reply-To: <467178AE.5040905@ribosome.natur.cuni.cz> References: <466938F6.7050903@ribosome.natur.cuni.cz> <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> <467178AE.5040905@ribosome.natur.cuni.cz> Message-ID: <46717990.6040509@ribosome.natur.cuni.cz> Martin MOKREJŠ wrote: >> Also, there is a *huge* amount of documentation and examples on the >> BioPerl website. >> >> http://www.bioperl.org/wiki/HOWTOs > > You mean > http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File > ? ;-) $ perl embl2picture.pl ~/99.gb | display - Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, line 125. Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa57f0), feature Bio::Location::Simple=HASH(0x893ebb8): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, line 125. Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5880), feature Bio::Location::Simple=HASH(0x893ebac): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, line 125. Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5430), feature Bio::Location::Simple=HASH(0x893e720): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, line 125. Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa546c), feature Bio::Location::Simple=HASH(0x893e6b4): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, line 125.
$ The plasmid is a circular DNA, why is the diagram in linear? ;-) Martin From bix at sendu.me.uk Thu Jun 14 13:03:34 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 14 Jun 2007 18:03:34 +0100 Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new() In-Reply-To: <46716E95.3090604@sheffield.ac.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <46716E95.3090604@sheffield.ac.uk> Message-ID: <467174E6.1090001@sendu.me.uk> Nathan S. Haigh wrote: >> PodSyntax is new and obviously a bunch of POD needs fixing: Nathan, are >> you working on that, or can I fix those errors? > > I can fix these - although I'm still trying to get my new Debian 4.0 > system up-to-speed so it might take me a little while! RE the > PodSyntax.t, I've got it silently skipping the tests if Test::Pod isn't > installed. However, would it be better to have Test::Pod in t/lib so > that it runs on the user's system during installation or leave it as is? Leave it as is. Every-day users don't need to check the syntax of the pod. In fact, it really only needs to be done once, prior to packaging up a new release. From n.haigh at sheffield.ac.uk Thu Jun 14 13:32:37 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Thu, 14 Jun 2007 18:32:37 +0100 Subject: [Bioperl-l] Perltidy In-Reply-To: <467177AC.8060104@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> Message-ID: <46717BB5.8000706@sheffield.ac.uk> Sendu Bala wrote: > Nathan S. Haigh wrote: >> I'm just wondering if anyone passes their modules through perltidy in >> order for them to have the same look/feel? If so, do you have a >> .perltidyrc file? Also, is it worth running the Bioperl modules >> through it? > > I don't use it, but I was contemplating the same thing. Chris uses it > from time to time and I think we have a similar taste in style. 
> > But we'd have to hammer something out that was agreeable to everyone. A starting place may be Perl Best Practices by Damian Conway: http://www.oreilly.com/catalog/perlbp/ The perltidyrc file can be found here: http://www.perlmonks.org/?node_id=485885 I also found this nice thread with some ideas, including some code that causes emacs to auto-perltidy everything you use cperl-mode with. I don't use emacs myself, but here's the link if anyone is interested: http://www.perlmonks.org/?node_id=516501 Nath From johnsonm at gmail.com Thu Jun 14 13:38:31 2007 From: johnsonm at gmail.com (Mark Johnson) Date: Thu, 14 Jun 2007 12:38:31 -0500 Subject: [Bioperl-l] Perltidy In-Reply-To: <467177AC.8060104@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> Message-ID: The nice thing about Perl Tidy is that everybody can have their own config file. There could be a bioperl default config that gets applied at checkin time. Anybody that didn't like it could script checkouts to get run through their own config. Diffs might get a little hairy, but as long as you tidy before diffing, it shouldn't be too bad. Speaking of which....coding style is controversial enough, but since that's already been opened, what about CVS vs Subversion? 8) Some of the scripting for this sort of thing might be easier in Subversion. Though maybe something like Git would fit the developer model better (more support for distributed development). On 6/14/07, Sendu Bala wrote: > Nathan S. Haigh wrote: > > I'm just wondering if anyone passes their modules through perltidy in > > order for them to have the same look/feel? If so, do you have a > > .perltidyrc file? Also, is it worth running the Bioperl modules through it? > > I don't use it, but I was contemplating the same thing. Chris uses it > from time to time and I think we have a similar taste in style.
> > But we'd have to hammer something out that was agreeable to everyone. From n.haigh at sheffield.ac.uk Thu Jun 14 13:39:39 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Thu, 14 Jun 2007 18:39:39 +0100 Subject: [Bioperl-l] cvs changes in working copy Message-ID: <46717D5B.5040108@sheffield.ac.uk> Not sure if I'm being dense or if it's because I've been working with svn recently, but - how do I get a list of files that are different in my working copy compared to the repository? Cheers Nath From cjfields at uiuc.edu Thu Jun 14 13:46:38 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 14 Jun 2007 12:46:38 -0500 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? In-Reply-To: <46717990.6040509@ribosome.natur.cuni.cz> References: <466938F6.7050903@ribosome.natur.cuni.cz> <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> <467178AE.5040905@ribosome.natur.cuni.cz> <46717990.6040509@ribosome.natur.cuni.cz> Message-ID: Is 99.gb supposed to be a GenBank file? And you're loading it into embl2picture (which I assume takes EMBL format files)? Without example code we can easily make the wrong assumptions (i.e. that this is user error and not a BioPerl problem). Also, I don't believe the feature plotting scripts plot circular chromosomes/plasmids. If you want this functionality you'll have to code it for yourself. chris On Jun 14, 2007, at 12:23 PM, Martin MOKREJŠ wrote: > Martin MOKREJŠ wrote: > >>> Also, there is a *huge* amount of documentation and examples on the >>> BioPerl website. >>> >>> http://www.bioperl.org/wiki/HOWTOs >> >> You mean >> http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File >> ?
;-) > > $ perl embl2picture.pl ~/99.gb | display - > Error returned while evaluating value of 'description' option for > glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature > Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method > "all_tags" via package "Bio::Location::Simple" at embl2picture.pl > line 141, line 125. > > Error returned while evaluating value of 'description' option for > glyph Bio::Graphics::Glyph::generic=HASH(0x8aa57f0), feature > Bio::Location::Simple=HASH(0x893ebb8): Can't locate object method > "all_tags" via package "Bio::Location::Simple" at embl2picture.pl > line 141, line 125. > > Error returned while evaluating value of 'description' option for > glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5880), feature > Bio::Location::Simple=HASH(0x893ebac): Can't locate object method > "all_tags" via package "Bio::Location::Simple" at embl2picture.pl > line 141, line 125. > > Error returned while evaluating value of 'description' option for > glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5430), feature > Bio::Location::Simple=HASH(0x893e720): Can't locate object method > "all_tags" via package "Bio::Location::Simple" at embl2picture.pl > line 141, line 125. > > Error returned while evaluating value of 'description' option for > glyph Bio::Graphics::Glyph::generic=HASH(0x8aa546c), feature > Bio::Location::Simple=HASH(0x893e6b4): Can't locate object method > "all_tags" via package "Bio::Location::Simple" at embl2picture.pl > line 141, line 125. > $ > > The plasmid is a circular DNA, why is the diagram in linear? ;-) > > Martin > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From arareko at campus.iztacala.unam.mx Thu Jun 14 13:57:35 2007 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Thu, 14 Jun 2007 12:57:35 -0500 Subject: [Bioperl-l] Perltidy In-Reply-To: <46717BB5.8000706@sheffield.ac.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <46717BB5.8000706@sheffield.ac.uk> Message-ID: <4671818F.5040902@campus.iztacala.unam.mx> I think a consensus .perltidyrc could be placed in the source distribution. Mauricio. Nathan S. Haigh wrote: > Sendu Bala wrote: >> Nathan S. Haigh wrote: >>> I'm just wondering if anyone passes their modules through perltidy in >>> order for them to have the same look/feel? If so, do you have a >>> .perltidyrc file? Also, is it worth running the Bioperl modules >>> through it? >> I don't use it, but I was contemplating the same thing. Chris uses it >> from time to time and I think we have a similar taste in style. >> >> But we'd have to hammer something out that was agreeable to everyone. > > A starting place may be Perl Best Practices by Damian Conway: > http://www.oreilly.com/catalog/perlbp/ > > > The perltidyrc file can be found here: > http://www.perlmonks.org/?node_id=485885 > > I also found this nice thread with some ideas, including some code that causes > emacs to auto-perltidy everything you use cperl-mode with.
I don't use > emacs myself, but here's the link if anyone is interested: > http://www.perlmonks.org/?node_id=516501 > > Nath > -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Genética Unidad de Morfofisiología y Función Facultad de Estudios Superiores Iztacala, UNAM From cjfields at uiuc.edu Thu Jun 14 14:32:41 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 14 Jun 2007 13:32:41 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> Message-ID: To chip in on this, I only use perltidy when I need to clean bioperl code up for debugging (particularly if blocks are hard to see) and just use the defaults. I agree it would be nice to have everything tidied up but it'll definitely need to be a consensus config file. About svn, I like the idea of eventually migrating to using it over CVS (I think BioPython and BioJava have plans to but I'm not sure) but I don't really know enough to say how feasible/difficult the migration path would be. Anyone know? chris On Jun 14, 2007, at 12:38 PM, Mark Johnson wrote: > The nice thing about Perl Tidy is that everybody can have their > own config file. There could be a bioperl default config that gets > applied at checkin time. Anybody that didn't like it could script > checkouts to get run through their own config. Diffs might get a > little hairy, but as long as you tidy before diffing, it shouldn't be > too bad. Speaking of which....coding style is controversial enough, > but since that's already been opened, what about CVS vs Subversion? 8) > Some of the scripting for this sort of thing might be easier in > Subversion.
Though maybe something like Git would fit the developer > model better (more support for distributed development). > > On 6/14/07, Sendu Bala wrote: >> Nathan S. Haigh wrote: >>> I'm just wondering if anyone passes their modules through >>> perltidy in >>> order for them to have the same look/feel? If so, do you have a >>> .perltidyrc file? Also, is it worth running the Bioperl modules >>> through it? >> >> I don't use it, but I was contemplating the same thing. Chris uses it >> from time to time and I think we have a similar taste in style. >> >> But we'd have to hammer something out that was agreeable to everyone. Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From johnsonm at gmail.com Thu Jun 14 14:46:24 2007 From: johnsonm at gmail.com (Mark Johnson) Date: Thu, 14 Jun 2007 13:46:24 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> Message-ID: If there was a default/standard/consensus bioperl perltidy config file, I would probably use it prior to checkin, on my own, so I could code in my schizophrenic style without worrying about starting any format wars. When I'm fixing or enhancing somebody else's code, I always try and adapt to whatever style they used, even if it grates on my nerves. I'd love to not have to worry about that with Bioperl. Of course, nobody will ever agree on a standard, so it's probably a moot point.
8) On 6/14/07, Chris Fields wrote: > To chip in on this, I only use perltidy when I need to clean bioperl > code up for debugging (particularly if blocks are hard to see) and > just use the defaults. I agree it would be nice to have everything > tidied up but it'll definitely need to be a consensus config file. > > About svn, I like the idea of eventually migrating to using it over > CVS (I think BioPython and BioJava have plans to but I'm not sure) > but I don't really know enough to say how feasible/difficult the > migration path would be. Anyone know? > > chris From jason at bioperl.org Thu Jun 14 15:00:09 2007 From: jason at bioperl.org (Jason Stajich) Date: Thu, 14 Jun 2007 12:00:09 -0700 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> Message-ID: On Jun 14, 2007, at 11:32 AM, Chris Fields wrote: > To chip in on this, I only use perltidy when I need to clean bioperl > code up for debugging (particularly if blocks are hard to see) and > just use the defaults. I agree it would be nice to have everything > tidied up but it'll definitely need to be a consensus config file. > Can we do any sort of massive conversion at some logical timepoint? Probably after a branch release or something? Because it basically means we're going to have differences on nearly every line, which is going to make diff-ing difficult when debugging old/new versions. Maybe it is not a problem because we aren't introducing any new bugs! > About svn, I like the idea of eventually migrating to using it over > CVS (I think BioPython and BioJava have plans to but I'm not sure) > but I don't really know enough to say how feasible/difficult the > migration path would be. Anyone know? > It's doable but non-trivial. A cvs2svn script (python, gah!) exists to help with this. There are pros and cons to converting.
There is a fair amount of documentation and other pointers out there that point to the CVS server for getting latest code so we'd need to think about whether we'd support some sort of backwards compatible SVN -> CVS for read-only or what. Mostly it will need someone to lead the charge - I made a go at doing it in the winter, but I really don't have the SVN-foo to make this work. We'd need someone with SVN experience to step up and help. You can always try and we can play with the converted repository for a while without making it the new code base. -j > chris > > On Jun 14, 2007, at 12:38 PM, Mark Johnson wrote: > >> The nice thing about Perl Tidy is that everybody can have their >> own config file. There could be a bioperl default config that gets >> applied at checkin time. Anybody that didn't like it could script >> checkouts to get run through their own config. Diffs might get a >> little hairy, but as long as you tidy before diffing, it shouldn't be >> too bad. Speaking of which....coding style is controversial enough, >> but since that's already been opened, what about CVS vs >> Subversion? 8) >> Some of the scripting for this sort of thing might be easer in >> Subversion. Though maybe something like Git would fit the developer >> model better (more support for distributed development). >> >> On 6/14/07, Sendu Bala wrote: >>> Nathan S. Haigh wrote: >>>> I'm just wondering if anyone passes their modules through >>>> perltidy in >>>> order for them to have the same look/feel? If so, do you have a >>>> .perltidyrc file? Also, is it worth running the Bioperl modules >>>> through it? >>> >>> I don't use it, but I was contemplating the same thing. Chris >>> uses it >>> from time to time and I think we have a similar taste in style. >>> >>> But we'd have to hammer something out that was agreeable to >>> everyone. 
>>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From jason at bioperl.org Thu Jun 14 15:01:27 2007 From: jason at bioperl.org (Jason Stajich) Date: Thu, 14 Jun 2007 12:01:27 -0700 Subject: [Bioperl-l] cvs changes in working copy In-Reply-To: <46717D5B.5040108@sheffield.ac.uk> References: <46717D5B.5040108@sheffield.ac.uk> Message-ID: cvs update | grep '^M' On Jun 14, 2007, at 10:39 AM, Nathan S. Haigh wrote: > Not sure if I'm being dense or if it's because I've been working with > svn recently, but - how do I get a list of files that are different in > my working copy compared to the repository? 
> > Cheers > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From cjfields at uiuc.edu Thu Jun 14 15:20:46 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 14 Jun 2007 14:20:46 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> Message-ID: On Jun 14, 2007, at 2:00 PM, Jason Stajich wrote: > > On Jun 14, 2007, at 11:32 AM, Chris Fields wrote: > >> To chip in on this, I only use perltidy when I need to clean bioperl >> code up for debugging (particularly if blocks are hard to see) and >> just use the defaults. I agree it would be nice to have everything >> tidied up but it'll definitely need to be a consensus config file. >> > > Can we do any sort of massive conversion at some logical timepoint. > Probably after a branch release or something? Because it basically > means we're going to have differences on nearly every line which is > going to make diff-ing difficult when debugging old/new versions. > Maybe it is not a problem because we aren't introducing and new bugs! I agree; if we intend on doing this it should be all at once, maybe on a branch dedicated to ensure that code changes don't tank tests (they shouldn't but one never knows). We would then need a script up- and-running that tidies everything up prior to commits (though what happens if perltidy tanks?...). Sendu, up for it? >> About svn, I like the idea of eventually migrating to using it over >> CVS (I think BioPython and BioJava have plans to but I'm not sure) >> but I don't really know enough to say how feasible/difficult the >> migration path would be. Anyone know? >> > > It's doable but non-trivial. cvs2svn (python gah!) script exists to > help in this. 
There are pros and cons to converting. There is a > fair amount of documentation and other pointers out there that point > to the CVS server for getting latest code so we'd need to think about > whether we'd support some sort of backwards compatible SVN -> CVS for > read-only or what. > > Mostly it will need someone to lead the charge - I made a go at doing > it in the winter, but I really don't have the SVN-foo to make this > work. We'd need someone with SVN experience to step up and help. > You can always try and we can play with the converted repository for > a while without making it the new code base. > > -j Stepped into that one, didn't I! I'll look into how much effort is involved and try getting something going in the next month or two, maybe sooner if time permits. I'm lacking on SVN-foo as well but it might be worth looking into. chris From arareko at campus.iztacala.unam.mx Thu Jun 14 15:50:39 2007 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Thu, 14 Jun 2007 14:50:39 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> Message-ID: <46719C0F.5010706@campus.iztacala.unam.mx> Chris Fields wrote: > On Jun 14, 2007, at 2:00 PM, Jason Stajich wrote: > >> On Jun 14, 2007, at 11:32 AM, Chris Fields wrote: >> >>> About svn, I like the idea of eventually migrating to using it over >>> CVS (I think BioPython and BioJava have plans to but I'm not sure) >>> but I don't really know enough to say how feasible/difficult the >>> migration path would be. Anyone know? >>> >> It's doable but non-trivial. cvs2svn (python gah!) script exists to >> help in this. There are pros and cons to converting. 
There is a >> fair amount of documentation and other pointers out there that point >> to the CVS server for getting latest code so we'd need to think about >> whether we'd support some sort of backwards compatible SVN -> CVS for >> read-only or what. >> >> Mostly it will need someone to lead the charge - I made a go at doing >> it in the winter, but I really don't have the SVN-foo to make this >> work. We'd need someone with SVN experience to step up and help. >> You can always try and we can play with the converted repository for >> a while without making it the new code base. >> >> -j > > Stepped into that one, didn't I! I'll look into how much effort is > involved and try getting something going in the next month or two, > maybe sooner if time permits. I'm lacking on SVN-foo as well but it > might be worth looking into. > > chris > Chris D has worked with CVS-SVN transitioning for other projects; maybe he can shed some light on this. Mauricio. -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Genética Unidad de Morfofisiología y Función Facultad de Estudios Superiores Iztacala, UNAM From sac at bioperl.org Thu Jun 14 17:33:39 2007 From: sac at bioperl.org (Steve Chervitz) Date: Thu, 14 Jun 2007 14:33:39 -0700 Subject: [Bioperl-l] How can I pull out all instances of a motif from a genome sequence and output them as a BED file? In-Reply-To: <5F3F49B4-CE07-4DD7-B82E-9DDE42B516D0@uiuc.edu> References: <5F3F49B4-CE07-4DD7-B82E-9DDE42B516D0@uiuc.edu> Message-ID: <8f200b4c0706141433i37267774u1dc2193d8508c47b@mail.gmail.com> This issue was discussed recently here. Check out this thread: http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15046/focus=15048 Some of the tools in the FAQ item Chris mentioned do not report where the match occurred, only that a match occurred (String::Approx, agrep), though some do report match locations (fuzznuc, fuzzprot; not sure about TFBS).
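A minimal sketch (an illustration added here, not code from the thread) of how core Perl can report match locations by itself: after a successful match, the special arrays @- and @+ hold the start and end offsets of the match. The helper name motif_hits, the toy sequence, and the chromosome label chr3L are all hypothetical, and only non-overlapping forward-strand matches are found. The output uses BED conventions (0-based start, end-exclusive).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns one [start, end] pair per non-overlapping
# match of $motif in $seq, in BED coordinates (0-based, end-exclusive).
sub motif_hits {
    my ($seq, $motif) = @_;
    my @hits;
    while ($seq =~ /$motif/g) {
        # $-[0] and $+[0] are the start and end offsets of the last match.
        push @hits, [ $-[0], $+[0] ];
    }
    return @hits;
}

# Toy sequence and chromosome label, for illustration only.
my $seq   = 'CCAAATTTAAAGG';
my $chrom = 'chr3L';

# Print one tab-separated BED line per hit: chrom, start, end.
for my $hit (motif_hits($seq, 'AAA')) {
    my ($start, $end) = @$hit;
    print join("\t", $chrom, $start, $end), "\n";
}
```

A likely cause of the list of 1's reported in the question quoted later in this message: in Perl, . and + have the same precedence and associate left to right, so "text" . pos($s)+1 concatenates first and then adds 1 to the resulting string. Parenthesizing the arithmetic, as in (pos($s) + 1), avoids that.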
My Bio::Tools::SeqPattern module does not even perform any matches; it just encapsulates a regular expression for a nuc or protein motif and knows how to handle ambiguity code expansion and reverse complementing. The idea is that you can use this to convert a biological sequence motif into a string suitable for use in a perl regex. Adding a match() method to this module would be handy. There is an example script for it in examples/tools of the distro (which, btw, references an obsolete module, so it won't run as is -- I'll fix). Steve On 6/13/07, Chris Fields wrote: > This is answered in the FAQ (sorry if the URL wraps, but we don't > like tinyurls): > > http://www.bioperl.org/wiki/ > FAQ#How_do_I_do_motif_searches_with_BioPerl.3F_Can_I_do_. > 22find_all_sequences_that_are_75.25_identical.22_to_a_given_motif.3F > > chris > > On Jun 13, 2007, at 7:20 PM, John Cumbers wrote: > > > Hello, > > > > I have a simple problem, I'm trying to search a genome sequence for > > a motif, > > I then want to output a BED file to display all the locations of > > this motif > > on the UCSC Genome Browser. I could not find a script to do this, > > so I > > started to write my own. I'm new to perl and my code below was my > > attempt > > to read the sequence string and output the index bp of the start of > > each > > motif. With this I could build the BED file myself, which requires > > start > > and finish base pairs. > > > > For the first motif I can output the start index, but when I try > > and read > > the next one off the sequence it does not work. Instead I just get an > > output of a list of 1's. I realise that this is more a request for > > some > > simple perl help, but any help much appreciated. > > > > Best wishes, > > John > > > > > > $seq_object = read_sequence > > ("Drosophila.Chr3.test.AE014296.fasta"); #turn > > my FASTA file into a seq object.
> > $sequence_as_a_string = $seq_object->seq(); #turn it into a string > > # search $sequence_as_a_string string for motif AAA as example > > # if found, return the index that it is found at > > > > while ($sequence_as_a_string =~ m/AAA/g) { > > print "Found '$&'. Next attempt at character " . > > pos($sequence_as_a_string)+1 . "\n"; > > } > > > > > > > > -- > > John Cumbers, Graduate Student > > Biology and Medicine > > Brown University, Box G-W > > Providence, Rhode Island, 02912, USA > > Tel USA: +1 401 523 8190, Fax: +1 401 863-2166 > > UK to USA: 0207 617 7824 > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From hlapp at gmx.net Thu Jun 14 19:04:11 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 14 Jun 2007 19:04:11 -0400 Subject: [Bioperl-l] cvs changes in working copy In-Reply-To: References: <46717D5B.5040108@sheffield.ac.uk> Message-ID: <3B262E6A-2C90-49FA-BCA1-BF1900C5AC3A@gmx.net> Actually, that will update your repository. If you just wanted to take a peek you would use cvs status: $ cvs status | grep 'Locally Modified' -hilmar On Jun 14, 2007, at 3:01 PM, Jason Stajich wrote: > cvs update | grep '^M' > > On Jun 14, 2007, at 10:39 AM, Nathan S. Haigh wrote: > >> Not sure if I'm being dense or if it's because I've been working with >> svn recently, but - how do I get a list of files that are >> different in >> my working copy compared to the repository? 
>> >> Cheers >> Nath >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > Jason Stajich > jason at bioperl.org > http://jason.open-bio.org/ > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From mmokrejs at ribosome.natur.cuni.cz Fri Jun 15 03:28:17 2007 From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=) Date: Fri, 15 Jun 2007 09:28:17 +0200 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? In-Reply-To: References: <466938F6.7050903@ribosome.natur.cuni.cz> <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> <467178AE.5040905@ribosome.natur.cuni.cz> <46717990.6040509@ribosome.natur.cuni.cz> Message-ID: <46723F91.60501@ribosome.natur.cuni.cz> Chris Fields wrote: > Is 99.gb supposed to be a GenBank file? And you're loading it into Yes, it was attached to the email. ;) > embl2picture (which I assume takes EMBL format files)? Without example > code we can easily make the wrong assumptions (i.e. that this is user > error and not a BioPerl problem). use constant USAGE =>< Render a GenBank/EMBL entry into drawable form. Return as a GIF or PNG image on standard output. File must be in embl, genbank, or another SeqIO- recognized format. Only the first entry will be rendered. Example to try: embl2picture.pl factor7.embl | display - END > > Also, I don't believe the feature plotting scripts plot circular > chromosomes/plasmids. If you want this functionality you'll have to > code it for yourself. That's a pitty it does not, but at least if someone could improve the docs. 
;) Unfortunately I don't have the time to rewrite the code myself now, I need a working, standalone, already available tool. :( M. > > chris > > On Jun 14, 2007, at 12:23 PM, Martin MOKREJŠ wrote: > >> Martin MOKREJŠ wrote: >> >>>> Also, there is a *huge* amount of documentation and examples on the >>>> BioPerl website. >>>> >>>> http://www.bioperl.org/wiki/HOWTOs >>> >>> You mean >>> http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File >>> >>> ? ;-) >> >> $ perl embl2picture.pl ~/99.gb | display - >> Error returned while evaluating value of 'description' option for >> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature >> Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method >> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line >> 141, line 125. >> >> Error returned while evaluating value of 'description' option for >> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa57f0), feature >> Bio::Location::Simple=HASH(0x893ebb8): Can't locate object method >> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line >> 141, line 125. >> >> Error returned while evaluating value of 'description' option for >> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5880), feature >> Bio::Location::Simple=HASH(0x893ebac): Can't locate object method >> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line >> 141, line 125. >> >> Error returned while evaluating value of 'description' option for >> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5430), feature >> Bio::Location::Simple=HASH(0x893e720): Can't locate object method >> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line >> 141, line 125.
>> >> Error returned while evaluating value of 'description' option for >> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa546c), feature >> Bio::Location::Simple=HASH(0x893e6b4): Can't locate object method >> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line >> 141, line 125. >> $ >> >> The plasmid is a circular DNA, why is the diagram in linear? ;-) >> >> Martin >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > > -- Dr. Martin Mokrejs Dept. of Genetics and Microbiology Faculty of Science, Charles University Vinicna 5, 128 43 Prague, Czech Republic http://www.iresite.org http://www.iresite.org/~mmokrejs From dhoworth at mrc-lmb.cam.ac.uk Fri Jun 15 04:59:09 2007 From: dhoworth at mrc-lmb.cam.ac.uk (Dave Howorth) Date: Fri, 15 Jun 2007 09:59:09 +0100 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? In-Reply-To: <46717990.6040509@ribosome.natur.cuni.cz> References: <466938F6.7050903@ribosome.natur.cuni.cz> <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> <467178AE.5040905@ribosome.natur.cuni.cz> <46717990.6040509@ribosome.natur.cuni.cz> Message-ID: <467254DD.3010505@mrc-lmb.cam.ac.uk> Martin MOKREJŠ wrote: >>> Also, there is a *huge* amount of documentation and examples on >>> the BioPerl website. >>> >>> http://www.bioperl.org/wiki/HOWTOs >> You mean >> http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File >> ? 
;-) > > $ perl embl2picture.pl ~/99.gb | display - Error returned while > evaluating value of 'description' option for glyph > Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature > Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method > "all_tags" via package "Bio::Location::Simple" at embl2picture.pl > line 141, line 125. Hmm an error at line 141 of a 69 line script? Methinks you're not actually running the script that's presented on the wiki page you quoted. I cut-and-pasted the script and your file and it worked for me (at least, it produced an image, along with a bunch of OOPS lines) HTH, Dave From n.haigh at sheffield.ac.uk Fri Jun 15 06:21:38 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 15 Jun 2007 11:21:38 +0100 Subject: [Bioperl-l] Installation using --install_base Message-ID: <46726832.7080601@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 I'm setting up a new installation of Debian 4.0 at home and though I'd try to install BioPerl as a normal user rather than root. So in CPAN options I set the --install_base to /home/username/perl and set PERL5LIB to point to the same place. Now, when I run "perl Build.PL" on the HEAD of BioPerl as a non root user and ask to install all optional modules, it tries to install them through CPAN - however it seems to fail because some dependencies don't seem to want to install in a user directory. Has anyone else found this or might I be doing something wrong? 
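One detail that may matter here, offered as a guess rather than a diagnosis: Module::Build's --install_base (like MakeMaker's INSTALL_BASE) places pure-Perl modules under <base>/lib/perl5, so PERL5LIB generally needs to name that subdirectory rather than the base directory itself. A small sketch for checking this follows; the helper name install_base_lib and the base path under the home directory are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: the directory where an install_base of $base
# keeps pure-Perl modules, i.e. <base>/lib/perl5.
sub install_base_lib {
    my ($base) = @_;
    $base =~ s{/+$}{};    # drop any trailing slashes
    return "$base/lib/perl5";
}

# Assumed install_base: a "perl" directory under the user's home.
my $lib = install_base_lib(($ENV{HOME} // '') . '/perl');

if (grep { $_ eq $lib } @INC) {
    print "$lib is already in \@INC\n";
}
else {
    print "$lib is not in \@INC; try setting PERL5LIB=$lib\n";
}
```

Running this as the same user and in the same shell used for "perl Build.PL" shows whether dependencies installed into the user directory would actually be found.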
Nath -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGcmgyczuW2jkwy2gRAtgqAKDIv717ciVHr5V+Z1kqPV2a++E8dgCfYr2a VPt4tEPLW2J+BiKnN3B8aV8= =c+9z -----END PGP SIGNATURE----- From bix at sendu.me.uk Fri Jun 15 06:07:04 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 15 Jun 2007 11:07:04 +0100 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> Message-ID: <467264C8.4020202@sendu.me.uk> Chris Fields wrote: > On Jun 14, 2007, at 2:00 PM, Jason Stajich wrote: > >> On Jun 14, 2007, at 11:32 AM, Chris Fields wrote: >> >>> To chip in on this, I only use perltidy when I need to clean bioperl >>> code up for debugging (particularly if blocks are hard to see) and >>> just use the defaults. I agree it would be nice to have everything >>> tidied up but it'll definitely need to be a consensus config file. >>> >> Can we do any sort of massive conversion at some logical timepoint. >> Probably after a branch release or something? Because it basically >> means we're going to have differences on nearly every line which is >> going to make diff-ing difficult when debugging old/new versions. >> Maybe it is not a problem because we aren't introducing and new bugs! Sorry, can you clarify the problem you envisage? And why would making a branch release help? > I agree; if we intend on doing this it should be all at once, maybe > on a branch dedicated to ensure that code changes don't tank tests > (they shouldn't but one never knows). We would then need a script up- > and-running that tidies everything up prior to commits (though what > happens if perltidy tanks?...). > > Sendu, up for it? If its going to be difficult and a hassle, for such an unnecessary thing I'm not sure its worth it. There are more pressing things to be done for Bioperl. 
If I can just run perltidy on the entire package and commit, I'd do it. If that's not appropriate, I won't. >>> About svn [snip] > Stepped into that one, didn't I! I'll look into how much effort is > involved and try getting something going in the next month or two, > maybe sooner if time permits. I'm lacking on SVN-foo as well but it > might be worth looking into. I'd put this in the unnecessary-but-nice category as well. If it will be as easy as my ->new change, go ahead. If not, there are more pressing matters (POD fixing, test script updating and finishing...). From n.haigh at sheffield.ac.uk Fri Jun 15 06:35:40 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 15 Jun 2007 11:35:40 +0100 Subject: [Bioperl-l] Installation using --install_base Message-ID: <46726B7C.7070902@sheffield.ac.uk> I'm setting up a new installation of Debian 4.0 at home and though I'd try to install BioPerl as a normal user rather than root. So in CPAN options I set the --install_base to /home/username/perl and set PERL5LIB to point to the same place. Now, when I run "perl Build.PL" on the HEAD of BioPerl as a non root user and ask to install all optional modules, it tries to install them through CPAN - however it seems to fail because some dependencies don't seem to want to install in a user directory. Has anyone else found this or might I be doing something wrong? Nath From bix at sendu.me.uk Fri Jun 15 06:45:48 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 15 Jun 2007 11:45:48 +0100 Subject: [Bioperl-l] Installation using --install_base In-Reply-To: <46726832.7080601@sheffield.ac.uk> References: <46726832.7080601@sheffield.ac.uk> Message-ID: <46726DDC.8090202@sendu.me.uk> Nathan S. Haigh wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > I'm setting up a new installation of Debian 4.0 at home and though I'd > try to install BioPerl as a normal user rather than root. 
So in CPAN > options I set the --install_base to /home/username/perl and set PERL5LIB > to point to the same place. > > Now, when I run "perl Build.PL" on the HEAD of BioPerl as a non root > user and ask to install all optional modules, it tries to install them > through CPAN - however it seems to fail because some dependencies don't > seem to want to install in a user directory. > > Has anyone else found this or might I be doing something wrong? You'll need to configure CPAN to install into your user directory. Upgrade to the latest version, then go read the docs on the various configurable options. I thought I at least mentioned this in the Bioperl INSTALL doc. If not, can someone come up with a concise clarification? From sdavis2 at mail.nih.gov Fri Jun 15 06:56:08 2007 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Fri, 15 Jun 2007 06:56:08 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <467264C8.4020202@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> Message-ID: <46727048.3080904@mail.nih.gov> Sendu Bala wrote: > If its going to be difficult and a hassle, for such an unnecessary thing > I'm not sure its worth it. There are more pressing things to be done for > Bioperl. > > If I can just run perltidy on the entire package and commit, I'd do it. > If that's not appropriate, I won't. I agree with the sentiment noted above. I'm a bit of an outsider here, but bioperl is a collaborative project. Not everyone has the same sentiments about what "correct" style means. As a programmer, I really wouldn't want significant changes on the style of my code. And perl happily puts up with many styles. I would say leave things as they are--let the individual programmers choose. It reduces the amount of work of questionable importance and allows the coding style freedom that perl supports. Just my $.02. 
Sean From cjfields at uiuc.edu Fri Jun 15 10:05:07 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 15 Jun 2007 09:05:07 -0500 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? In-Reply-To: <46723F91.60501@ribosome.natur.cuni.cz> References: <466938F6.7050903@ribosome.natur.cuni.cz> <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> <467178AE.5040905@ribosome.natur.cuni.cz> <46717990.6040509@ribosome.natur.cuni.cz> <46723F91.60501@ribosome.natur.cuni.cz> Message-ID: On Jun 15, 2007, at 2:28 AM, Martin MOKREJŠ wrote: > Chris Fields wrote: >> Is 99.gb supposed to be a GenBank file? And you're loading it into > > Yes, it was attached to the email. ;) Sorry about that. I notice that '.' was added, but the spacing seemed off. I think bioperl catches that fine but it's something Wayne should consider. >> embl2picture (which I assume takes EMBL format files)? Without >> example >> code we can easily make the wrong assumptions (i.e. that this is user >> error and not a BioPerl problem). > > use constant USAGE =>< Usage: $0 > Render a GenBank/EMBL entry into drawable form. > Return as a GIF or PNG image on standard output. > > File must be in embl, genbank, or another SeqIO- > recognized format. Only the first entry will be > rendered. > > Example to try: > embl2picture.pl factor7.embl | display - > > END Horribly named script (should be seq2picture, since it converts both gb/embl). The use of 'all_tags' makes me think the script version you are using is old, as those methods have long since been renamed. Dave has it working though, so maybe your version has been updated? The 'use of initialized data in' errors are probably from inclusion of mandatory fields with no data or '.'. >> Also, I don't believe the feature plotting scripts plot circular >> chromosomes/plasmids. If you want this functionality you'll have to >> code it for yourself. > > That's a pitty it does not, but at least if someone could improve > the docs. 
;) > Unfortunately I don't have the time to rewrite the code myself now, > I need a working, standalone, already available tool. :( > M. As I said, unless someone shows interest and codes it just won't get done. We have had very little interest in this, either b/c there are tools already out there to do this very thing (multitudes of plasmid drawing programs, some free like ApE) or that nobody's bothered to write it up. chris From cjfields at uiuc.edu Fri Jun 15 10:22:23 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 15 Jun 2007 09:22:23 -0500 Subject: [Bioperl-l] Perltidy and... SVN and ...Re: Perltidy In-Reply-To: <46727048.3080904@mail.nih.gov> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> <46727048.3080904@mail.nih.gov> Message-ID: <78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu> On Jun 15, 2007, at 5:56 AM, Sean Davis wrote: > Sendu Bala wrote: >> If its going to be difficult and a hassle, for such an unnecessary >> thing >> I'm not sure its worth it. There are more pressing things to be >> done for >> Bioperl. >> >> If I can just run perltidy on the entire package and commit, I'd >> do it. >> If that's not appropriate, I won't. > > I agree with the sentiment noted above. I'm a bit of an outsider > here, > but bioperl is a collaborative project. Not everyone has the same > sentiments about what "correct" style means. As a programmer, I > really > wouldn't want significant changes on the style of my code. And perl > happily puts up with many styles. I would say leave things as they > are--let the individual programmers choose. It reduces the amount of > work of questionable importance and allows the coding style freedom > that > perl supports. > > Just my $.02. > > Sean I tend to run it on modules that need some reformatting (SearchIO::blast comes to mind). 
I believe you're correct when this comes down to programming style, but I think this echoes a sentiment (frustration, perhaps) that some of us have with long-term maintenance of said code. Maybe a compromise: include a copy of .perltidyrc with the distribution that goes by what a consensus wants or by the general rules laid out in Perl Best Practices (spaced settings, use of spaces over tabs, etc). Conversion would be encouraged but voluntary, with the caveat that if someone needs to clean up code down the road (bug fixes, enhancements, etc) and if the original author isn't able to add it in themselves, it could be perltidy'd in order to help the developer (locate and fix the issue)|(add relevant enhancement where needed). chris From cjfields at uiuc.edu Fri Jun 15 10:56:23 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 15 Jun 2007 09:56:23 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <467264C8.4020202@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> Message-ID: <3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote: >>>> ... >>> Can we do any sort of massive conversion at some logical timepoint. >>> Probably after a branch release or something? Because it basically >>> means we're going to have differences on nearly every line which is >>> going to make diff-ing difficult when debugging old/new versions. >>> Maybe it is not a problem because we aren't introducing and new >>> bugs! > > Sorry, can you clarify the problem you envisage? And why would > making a branch release help? Maybe the worry is that mass conversion in such a large codebase could lead to hard-to-locate bugs. Shouldn't occur but who knows w/o trying? 
>> I agree; if we intend on doing this it should be all at once, >> maybe on a branch dedicated to ensure that code changes don't >> tank tests (they shouldn't but one never knows). We would then >> need a script up- and-running that tidies everything up prior to >> commits (though what happens if perltidy tanks?...). >> Sendu, up for it? > > If its going to be difficult and a hassle, for such an unnecessary > thing I'm not sure its worth it. There are more pressing things to > be done for Bioperl. > > If I can just run perltidy on the entire package and commit, I'd do > it. If that's not appropriate, I won't. The choices aren't necessarily all or nothing. What about voluntary, recommended use of a perltidy config file included with the distribution, with additional 'caveats'? See my response to Sean. >>>> About svn > [snip] >> Stepped into that one, didn't I! I'll look into how much effort >> is involved and try getting something going in the next month or >> two, maybe sooner if time permits. I'm lacking on SVN-foo as >> well but it might be worth looking into. > > I'd put this in the unnecessary-but-nice category as well. If it > will be as easy as my ->new change, go ahead. If not, there are > more pressing matters (POD fixing, test script updating and > finishing...). A few other open-bio projects have actively discussed a CVS->SVN migration (BioRuby and I think BioPython, though the latter could be wrong). As I said, "it might be worth looking into" to weigh the pros/cons, get others opinions from others who have made the transition, etc. We could, as Jason suggested, even set up a tester SVN w/o making it the default codebase (lock it off to a few testers, have CVS commits automatically/manually carry over to SVN, etc). I agree with you that it's not feasible to switch over prior to a release and that there are more pressing issues, but it doesn't hurt having an open discussion about it. 
chris From sdavis2 at mail.nih.gov Fri Jun 15 11:15:57 2007 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Fri, 15 Jun 2007 11:15:57 -0400 Subject: [Bioperl-l] Perltidy and... SVN and ...Re: Perltidy In-Reply-To: <78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> <46727048.3080904@mail.nih.gov> <78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu> Message-ID: <4672AD2D.2090001@mail.nih.gov> Chris Fields wrote: > > On Jun 15, 2007, at 5:56 AM, Sean Davis wrote: > >> Sendu Bala wrote: >>> If its going to be difficult and a hassle, for such an unnecessary thing >>> I'm not sure its worth it. There are more pressing things to be done for >>> Bioperl. >>> >>> If I can just run perltidy on the entire package and commit, I'd do it. >>> If that's not appropriate, I won't. >> >> I agree with the sentiment noted above. I'm a bit of an outsider here, >> but bioperl is a collaborative project. Not everyone has the same >> sentiments about what "correct" style means. As a programmer, I really >> wouldn't want significant changes on the style of my code. And perl >> happily puts up with many styles. I would say leave things as they >> are--let the individual programmers choose. It reduces the amount of >> work of questionable importance and allows the coding style freedom that >> perl supports. >> >> Just my $.02. >> >> Sean > > I tend to run it on modules that need some reformatting (SearchIO::blast > comes to mind). I believe you're correct when this comes down to > programming style, but I think this echoes a sentiment (frustration, > perhaps) that some of us have with long-term maintenance of said code. 
> > Maybe a compromise: include a copy of .perltidyrc with the distribution > that goes by what a consensus wants or by the general rules laid out in > Perl Best Practices (spaced settings, use of spaces over tabs, etc). > Conversion would be encouraged but voluntary, with the caveat that if > someone needs to clean up code down the road (bug fixes, enhancements, > etc) and if the original author isn't able to add it in themselves, it > could be perltidy'd in order to help the developer (locate and fix the > issue)|(add relevant enhancement where needed). Don't get me wrong--I think whatever makes bioperl a better, more maintainable beast should be what is done. The bioperl gurus should absolutely do what is best for them for code maintainability. Sean From n.haigh at sheffield.ac.uk Fri Jun 15 11:17:15 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 15 Jun 2007 16:17:15 +0100 Subject: [Bioperl-l] Perltidy and... SVN and ...Re: Perltidy In-Reply-To: <78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> <46727048.3080904@mail.nih.gov> <78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu> Message-ID: <4672AD7B.4050109@sheffield.ac.uk> Chris Fields wrote: > On Jun 15, 2007, at 5:56 AM, Sean Davis wrote: > >> Sendu Bala wrote: >>> If its going to be difficult and a hassle, for such an unnecessary >>> thing >>> I'm not sure its worth it. There are more pressing things to be >>> done for >>> Bioperl. >>> >>> If I can just run perltidy on the entire package and commit, I'd >>> do it. >>> If that's not appropriate, I won't. >> I agree with the sentiment noted above. I'm a bit of an outsider >> here, >> but bioperl is a collaborative project. Not everyone has the same >> sentiments about what "correct" style means. 
As a programmer, I >> really >> wouldn't want significant changes on the style of my code. And perl >> happily puts up with many styles. I would say leave things as they >> are--let the individual programmers choose. It reduces the amount of >> work of questionable importance and allows the coding style freedom >> that >> perl supports. >> >> Just my $.02. >> >> Sean > > I tend to run it on modules that need some reformatting > (SearchIO::blast comes to mind). I believe you're correct when this > comes down to programming style, but I think this echoes a sentiment > (frustration, perhaps) that some of us have with long-term > maintenance of said code. > > Maybe a compromise: include a copy of .perltidyrc with the > distribution that goes by what a consensus wants or by the general > rules laid out in Perl Best Practices (spaced settings, use of spaces > over tabs, etc). RE spaces, tabs etc - how well is the different coding styles handled for displaying in html and via the online browsable cvs? Conversion would be encouraged but voluntary, with > the caveat that if someone needs to clean up code down the road (bug > fixes, enhancements, etc) and if the original author isn't able to > add it in themselves, it could be perltidy'd in order to help the > developer (locate and fix the issue)|(add relevant enhancement where > needed). > > chris > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From johnsonm at gmail.com Fri Jun 15 15:37:26 2007 From: johnsonm at gmail.com (Mark Johnson) Date: Fri, 15 Jun 2007 14:37:26 -0500 Subject: [Bioperl-l] Why does Bio::DB::GFF::Feature::gff3_string swap start and stop coordinates?? In-Reply-To: References: <79FDA731-CC37-42B0-8200-0865F52C1CAC@uiuc.edu> <62034FE5-C375-49F3-9A4E-2545F93615F4@uiuc.edu> <9FAD90F3-79B3-4002-9A11-6C11F7D00614@uiuc.edu> Message-ID: Patches waiting in Bugzilla (Bug #2299). 
Changes: -Bio::Tools::Glimmer now emits Bio::SeqFeature::Generic features for prokaryotic reports (Glimmer2/Glimmer3) -Bio::Tools::Glimmer now produces features with Fuzzy or Split locations as appropriate (partial or circular/wraparound predictions) -Bio::Tools:Glimmer goes through the Glimmer3 .detail file to pull out sequence lengths -Bio::Tools::Run::Glimmer passes along the sequence length to Bio::Tools::Glimmer for Glimmer2 I should probably modify Bio::Tools::Genemark to use Bio::SeqFeature::Generic features for prokaryotic reports, to be consistent, but this is more likely to surprise people. If nobody screams about the change to Bio::Tools::Glimmer, I'll do it at some point. On 5/21/07, Chris Fields wrote: > > On May 21, 2007, at 7:29 PM, Torsten Seemann wrote: > > >> glimmer2/3 both assume the genome is circular by default (I'm > >> assuming since Glimmer2/3 are used for bacterial genomes). Acc. to > >> the Glimmer3 release notes the detail file has the information in the > >> header; from the Glimmer3 data used for tests: > > > > You beat me to the reply Chris - yes, Glimmer2/3 assume circular > > chromosome by default. I had forgotten about this in earlier > > discussions of the new Glimmer parsers as I normally run it in > > --linear / -L mode (even if I know it is circular) because it is > > easier to handle, and our sequencer/assembler team usually gets the > > origin of replication right. > > > >> Command: /bio/sw/glimmer3/bin/glimmer3 -o 50 -g 110 -t 30 ../BCTDNA > >> Glimmer3.icm Glimmer3 > > > > I did a double-take here - that's the path to my Glimmer3 > > installation! It took me a couple of minutes to realise that you got > > it from the bioperl test data I created. D'oh! :-) > > Yep, I forgot about that! > > >> There are options available for glimmer3 (-L, -X) that specify a > >> linear sequence or allow ORFs to extend past the end of the sequence > >> analyzed (the latter assumes a linear sequence). 
> > > > If the -L mode should produce Bio::Location::Split objects, I guess if > > -X is used > > it should produce Bio::Location::Fuzzy objects too... > > > > --Torsten > > True, didn't think about that one. Def. something to consider adding > in. > > chris > > > From cjfields at uiuc.edu Fri Jun 15 16:55:06 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 15 Jun 2007 15:55:06 -0500 Subject: [Bioperl-l] Why does Bio::DB::GFF::Feature::gff3_string swap start and stop coordinates?? In-Reply-To: References: <79FDA731-CC37-42B0-8200-0865F52C1CAC@uiuc.edu> <62034FE5-C375-49F3-9A4E-2545F93615F4@uiuc.edu> <9FAD90F3-79B3-4002-9A11-6C11F7D00614@uiuc.edu> Message-ID: I'll try getting to that in tonight. Been pretty tied up lately... chris On Jun 15, 2007, at 2:37 PM, Mark Johnson wrote: > Patches waiting in Bugzilla (Bug #2299). Changes: > > -Bio::Tools::Glimmer now emits Bio::SeqFeature::Generic features for > prokaryotic reports (Glimmer2/Glimmer3) > -Bio::Tools::Glimmer now produces features with Fuzzy or Split > locations as appropriate (partial or circular/wraparound predictions) > -Bio::Tools:Glimmer goes through the Glimmer3 .detail file to pull out > sequence lengths > -Bio::Tools::Run::Glimmer passes along the sequence length to > Bio::Tools::Glimmer for Glimmer2 > > I should probably modify Bio::Tools::Genemark to use > Bio::SeqFeature::Generic features for prokaryotic reports, to be > consistent, but this is more likely to surprise people. If nobody > screams about the change to Bio::Tools::Glimmer, I'll do it at some > point. > > On 5/21/07, Chris Fields wrote: >> >> On May 21, 2007, at 7:29 PM, Torsten Seemann wrote: >> >>>> glimmer2/3 both assume the genome is circular by default (I'm >>>> assuming since Glimmer2/3 are used for bacterial genomes). Acc. 
to >>>> the Glimmer3 release notes the detail file has the information >>>> in the >>>> header; from the Glimmer3 data used for tests: >>> >>> You beat me to the reply Chris - yes, Glimmer2/3 assume circular >>> chromosome by default. I had forgotten about this in earlier >>> discussions of the new Glimmer parsers as I normally run it in >>> --linear / -L mode (even if I know it is circular) because it is >>> easier to handle, and our sequencer/assembler team usually gets the >>> origin of replication right. >>> >>>> Command: /bio/sw/glimmer3/bin/glimmer3 -o 50 -g 110 -t 30 ../ >>>> BCTDNA >>>> Glimmer3.icm Glimmer3 >>> >>> I did a double-take here - that's the path to my Glimmer3 >>> installation! It took me a couple of minutes to realise that you got >>> it from the bioperl test data I created. D'oh! :-) >> >> Yep, I forgot about that! >> >>>> There are options available for glimmer3 (-L, -X) that specify a >>>> linear sequence or allow ORFs to extend past the end of the >>>> sequence >>>> analyzed (the latter assumes a linear sequence). >>> >>> If the -L mode should produce Bio::Location::Split objects, I >>> guess if >>> -X is used >>> it should produce Bio::Location::Fuzzy objects too... >>> >>> --Torsten >> >> True, didn't think about that one. Def. something to consider adding >> in. >> >> chris >> >> >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From rvos at interchange.ubc.ca Fri Jun 15 17:08:17 2007 From: rvos at interchange.ubc.ca (rvos) Date: Fri, 15 Jun 2007 14:08:17 -0700 (PDT) Subject: [Bioperl-l] SVN and ...Re: Perltidy Message-ID: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> Hi, I would very much prefer it if bioperl moved to svn. 
I'm considering merging Bio::Phylo (to the extent that that's possible/practical) with bioperl and moving it to an OBF repository, but I'd rather not go back to CVS.

Rutger

-----Original Message-----

> Date: Fri Jun 15 07:56:23 PDT 2007
> From: "Chris Fields"
> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy
> To: "Sendu Bala"
>
> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:
>
> >>>> ...
> >>> Can we do any sort of massive conversion at some logical timepoint.
> >>> Probably after a branch release or something? Because it basically
> >>> means we're going to have differences on nearly every line which is
> >>> going to make diff-ing difficult when debugging old/new versions.
> >>> Maybe it is not a problem because we aren't introducing any new
> >>> bugs!
> >
> > Sorry, can you clarify the problem you envisage? And why would
> > making a branch release help?
>
> Maybe the worry is that mass conversion in such a large codebase
> could lead to hard-to-locate bugs. Shouldn't occur but who knows w/o
> trying?
>
> >> I agree; if we intend on doing this it should be all at once,
> >> maybe on a branch dedicated to ensure that code changes don't
> >> tank tests (they shouldn't but one never knows). We would then
> >> need a script up-and-running that tidies everything up prior to
> >> commits (though what happens if perltidy tanks?...).
> >> Sendu, up for it?
> >
> > If it's going to be difficult and a hassle, for such an unnecessary
> > thing I'm not sure it's worth it. There are more pressing things to
> > be done for Bioperl.
> >
> > If I can just run perltidy on the entire package and commit, I'd do
> > it. If that's not appropriate, I won't.
>
> The choices aren't necessarily all or nothing. What about voluntary,
> recommended use of a perltidy config file included with the
> distribution, with additional 'caveats'? See my response to Sean.
>
> >>>> About svn
> > [snip]
> >> Stepped into that one, didn't I! I'll look into how much effort
> >> is involved and try getting something going in the next month or
> >> two, maybe sooner if time permits. I'm lacking on SVN-foo as
> >> well but it might be worth looking into.
> >
> > I'd put this in the unnecessary-but-nice category as well. If it
> > will be as easy as my ->new change, go ahead. If not, there are
> > more pressing matters (POD fixing, test script updating and
> > finishing...).
>
> A few other open-bio projects have actively discussed a CVS->SVN
> migration (BioRuby and I think BioPython, though the latter could be
> wrong). As I said, "it might be worth looking into" to weigh the
> pros/cons, get opinions from others who have made the
> transition, etc. We could, as Jason suggested, even set up a tester
> SVN w/o making it the default codebase (lock it off to a few testers,
> have CVS commits automatically/manually carry over to SVN, etc).
>
> I agree with you that it's not feasible to switch over prior to a
> release and that there are more pressing issues, but it doesn't hurt
> having an open discussion about it.
>
> chris

From spiros at lokku.com Fri Jun 15 17:40:32 2007
From: spiros at lokku.com (Spiros Denaxas)
Date: Fri, 15 Jun 2007 22:40:32 +0100
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
Message-ID:

On 6/15/07, rvos wrote:
> Hi,
>
> I would very much prefer it if bioperl moved to svn. I'm considering
> merging Bio::Phylo (to the extent that that's possible/practical) with
> bioperl and move it to an OBF repository, but I'd rather not go back
> to CVS.
>
> Rutger
>

I second that, SVN seems like the reasonable choice. I would be more than happy to help out as well.

Spiros

From hlapp at gmx.net Fri Jun 15 18:10:25 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 15 Jun 2007 18:10:25 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To:
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
Message-ID: <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>

So should we set up a sandbox svn repository and those who would like to help out

- take shots at migrating bioperl (any current cvs snapshot will do) to svn

- you document what you find yourself having to do in trying to make it work

- you report back when you think you have a working repository

- we all get a defined amount of time to test to our hearts' content, say 2 weeks

- you fix issues that were encountered

- report back when done, followed by retesting for, say 1 week

- iterate previous 2 steps until no issues and no objections to migration

- two more weeks of warning period to all developers to commit all outstanding changes, or reapply them to a future svn checkout

- pull the trigger by locking down cvs, applying the migration as worked out before, and announcing that BioPerl is now on svn

- get free beer at next BOSC (I'll pay if no one else does)

This may not be precisely the plan that needs to be executed, but it's probably somewhere along those lines.

If there are volunteers who would like to spearhead this, then power to you - I think everyone is in favor and the advantages of svn don't need to be debated. The only reason it hasn't happened yet is because no one has stepped forward who would have the energy.

I'm sure ChrisD will gladly create the svn sandbox if we have volunteers lined up to get going.

-hilmar

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From jason at bioperl.org Fri Jun 15 18:23:15 2007
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 15 Jun 2007 15:23:15 -0700
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
Message-ID:

Sounds like a plan, I'll be curious to see if we can still keep anonymous CVS working as I'd like to not have to pull the plug on that. There are some threads out on the web about how to do this with a commit rule on SVN.

Also, can someone who is close enough to all the SVN benefits please elaborate how it is going to help _this_ project? Perhaps you would be willing to put a few words up -- like on (a to be created):
http://bioperl.org/wiki/BioPerl:Version_control_changeover

This way if anonymous CVS is broken and/or developers who haven't been paying attention come back to commit code and ask why things changed we don't have to compose long emails...
=)

-jason

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/

From sheris at eps.berkeley.edu Fri Jun 15 18:58:12 2007
From: sheris at eps.berkeley.edu (Sheri Simmons)
Date: Fri, 15 Jun 2007 15:58:12 -0700
Subject: [Bioperl-l] seq doesn't validate error
Message-ID: <200706151558.12911.sheris@eps.berkeley.edu>

Hi,

I'm getting an error as follows when I try to reverse complement a sequence string stored in a hash of arrays. The storage code is:

    $nstarthash{$key} = [$sortchecks[0], join("", @nseq),
                         join("",@{$seqhash{$key}})];

the sequence of interest is the element at index 1.

Later, I try to retrieve this string for a subset of keys so I can reverse complement it based on input from another hash (%complement):

    my %revcomphash = map { my $read = $_;
        grep $complement{$read} eq 'C', %complement;
        {$_, (Bio::Seq->new(-seq =>$nstarthash{$_}[1]))->revcom->seq()};}
        keys(%nstarthash);

I get the following warning (long sequence edited for clarity):

-- -------------------- WARNING ---------------------
MSG: seq doesn't validate, mismatch is 1
---------------------------------------------------

------------- EXCEPTION -------------
MSG: Attempting to set the sequence to [GCCCCTGTAATCGCTTTTATATCGTCAGCGATC] which does not look healthy
STACK Bio::PrimarySeq::seq /usr/share/perl5/Bio/PrimarySeq.pm:268
STACK Bio::PrimarySeq::new /usr/share/perl5/Bio/PrimarySeq.pm:217
STACK Bio::Seq::new /usr/share/perl5/Bio/Seq.pm:498
STACK toplevel ../quality_wrapper.pl:103

I cannot find any non-allowed characters in the sequence, and the de-referencing appears to work correctly. Can anyone help me? I'm using the latest Bioperl installation (1.5.2) with ActivePerl5.8 on a Mepis 6.5 system.

Thanks
Sheri

---------------------------------------------------------------------
Sheri Simmons
Department of Earth and Planetary Sciences
University of California, Berkeley
Berkeley, CA 94720-4767

From Kevin.M.Brown at asu.edu Fri Jun 15 19:11:34 2007
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Fri, 15 Jun 2007 16:11:34 -0700
Subject: [Bioperl-l] seq doesn't validate error
In-Reply-To: <200706151558.12911.sheris@eps.berkeley.edu>
References: <200706151558.12911.sheris@eps.berkeley.edu>
Message-ID: <1A4207F8295607498283FE9E93B775B4034C6E7B@EX02.asurite.ad.asu.edu>

> I cannot find any non-allowed characters in the sequence, and
> the de-referencing appears to work correctly. Can anyone help me?
> I'm using the latest Bioperl installation (1.5.2) with
> ActivePerl5.8 on a Mepis 6.5 system.

Try telling the Bio::Seq object what alphabet to use when creating it. I tend to create them like:

    Bio::Seq->new(-seq=> $seqvar, -alphabet=>'dna')

From sheris at eps.berkeley.edu Fri Jun 15 19:53:04 2007
From: sheris at eps.berkeley.edu (Sheri Simmons)
Date: Fri, 15 Jun 2007 16:53:04 -0700
Subject: [Bioperl-l] seq doesn't validate error
In-Reply-To: <1A4207F8295607498283FE9E93B775B4034C6E7B@EX02.asurite.ad.asu.edu>
References: <200706151558.12911.sheris@eps.berkeley.edu> <1A4207F8295607498283FE9E93B775B4034C6E7B@EX02.asurite.ad.asu.edu>
Message-ID: <200706151653.04135.sheris@eps.berkeley.edu>

Thanks for the suggestion, but that still gives the same error as before.
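
For anyone hitting the same exception, a minimal BioPerl-free sketch for checking what is actually in the string before it reaches Bio::Seq->new. It assumes (this is not confirmed anywhere in the thread) that the culprit is a stray character such as whitespace, a digit or a control character, which the bracketed sequence in the error message may not make visible. The allowed-character class is only an approximation of what Bio::PrimarySeq accepts, and the sub names and sample data are hypothetical. The last sub shows one way to pick out only the reads flagged 'C' in %complement; in the posted map block the result of the grep is discarded, so every key is processed.

```perl
use strict;
use warnings;

# Characters assumed to be acceptable in a sequence string. This class is an
# approximation of the Bio::PrimarySeq check, not copied from BioPerl.
my $ALLOWED = qr/[A-Za-z\-.*?=~]/;

# Hypothetical helper: return [position, character code] pairs for every
# character in the string that falls outside $ALLOWED.
sub find_bad_chars {
    my ($seq) = @_;
    my @bad;
    for my $i (0 .. length($seq) - 1) {
        my $ch = substr($seq, $i, 1);
        push @bad, [$i, ord $ch] unless $ch =~ $ALLOWED;
    }
    return @bad;
}

# Plain-Perl reverse complement for unambiguous DNA (A, C, G, T only).
# Used here so the sketch runs without BioPerl; it is not Bio::Seq->revcom.
sub revcomp {
    my ($seq) = @_;
    my $rc = scalar reverse $seq;
    $rc =~ tr/ACGTacgt/TGCAtgca/;
    return $rc;
}

# Build read => reverse-complemented sequence, but only for reads whose
# flag is 'C'. $seqs is a hash of arrays with the sequence at index 1.
sub revcomp_flagged {
    my ($seqs, $flags) = @_;
    my %out;
    for my $read (keys %$seqs) {
        next unless defined $flags->{$read} && $flags->{$read} eq 'C';
        $out{$read} = revcomp($seqs->{$read}[1]);
    }
    return %out;
}

# Made-up data in the same shape as %nstarthash and %complement.
my %nstarthash = (
    readA => [10, "GATTACA",    "unused"],
    readB => [20, "GCCC CTG\n", "unused"],   # embedded space and newline
);
my %complement = (readA => 'C', readB => 'U');

# Report anything suspicious, one line per offending character.
for my $key (sort keys %nstarthash) {
    for my $hit (find_bad_chars($nstarthash{$key}[1])) {
        printf "%s: position %d has character code %d\n", $key, @$hit;
    }
}

my %revcomphash = revcomp_flagged(\%nstarthash, \%complement);
print "$_ => $revcomphash{$_}\n" for sort keys %revcomphash;
```

If find_bad_chars reports nothing for the failing read, the assumption above is wrong and the problem lies elsewhere, for example in which array element is being passed to the constructor.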

--
Sheri Simmons
Department of Earth and Planetary Sciences
University of California, Berkeley
Berkeley, CA 94720-4767

From hlapp at gmx.net Fri Jun 15 21:27:42 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 15 Jun 2007 21:27:42 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <18035.14352.963113.473274@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <18035.14352.963113.473274@almost.alerce.com>
Message-ID:

Could you post a ticket to the helpdesk: support at open-bio.org.

-hilmar

On Jun 15, 2007, at 9:08 PM, George Hartzell wrote:

> Hilmar Lapp writes:
>> So should we set up a sandbox svn repository and those who would like
>> to help out
>>
>> - take shots at migrating bioperl (any current cvs snapshot will do)
>> to svn
>
> Free Beer, huh? Do you deliver?
>
> Can you package up a tarball of the cvs repository (bzip or gzip would
> save some time) itself?
>
> thanks!
>
> g.

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From hartzell at alerce.com Fri Jun 15 21:08:32 2007
From: hartzell at alerce.com (George Hartzell)
Date: Fri, 15 Jun 2007 21:08:32 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
Message-ID: <18035.14352.963113.473274@almost.alerce.com>

Hilmar Lapp writes:
> So should we set up a sandbox svn repository and those who would like
> to help out
>
> - take shots at migrating bioperl (any current cvs snapshot will do)
> to svn

Free Beer, huh? Do you deliver?

Can you package up a tarball of the cvs repository (bzip or gzip would save some time) itself?

thanks!

g.

From cjfields at uiuc.edu Fri Jun 15 21:42:05 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 20:42:05 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <18035.14352.963113.473274@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <18035.14352.963113.473274@almost.alerce.com>
Message-ID: <6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu>

The browsable CVS has a 'Download tarball' link if that helps.

http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/?cvsroot=bioperl

chris

From cjfields at uiuc.edu Fri Jun 15 21:50:09 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 20:50:09 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
Message-ID: <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>

I'll help out to the extent I can w/o having the SVN know-how. We need (as Jason points out) someone who can detail the benefits and maybe keep an updated journal on the wiki. I believe at least one or two of the other Bio* projects have contemplated moving over to SVN, which may be worth checking out.

chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From hlapp at gmx.net Fri Jun 15 22:12:55 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 15 Jun 2007 22:12:55 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <18035.14352.963113.473274@almost.alerce.com> <6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu>
Message-ID: <6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net>

I think he meant the cvs repository itself, containing all the change data.

-hilmar

On Jun 15, 2007, at 9:42 PM, Chris Fields wrote:

> The browsable CVS has a 'Download tarball' link if that helps.
>
> http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/?
> cvsroot=bioperl > > chris > > On Jun 15, 2007, at 8:08 PM, George Hartzell wrote: > >> Hilmar Lapp writes: >>> So should we set up a sandbox svn repository and those who would >>> like >>> to help out >>> >>> - take shots at migrating bioperl (any current cvs snapshot will do) >>> to svn >> >> Free Beer, huh? Do you deliver? >> >> Can you package up a tarball of the cvs repository (bzip or gzip >> would >> save some time) itself? >> >> thanks! >> >> g. > > > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Fri Jun 15 22:37:55 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 15 Jun 2007 21:37:55 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <18035.14352.963113.473274@almost.alerce.com> <6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu> <6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net> Message-ID: Ah, got it. Sorry. George, planning on taking this up? chris On Jun 15, 2007, at 9:12 PM, Hilmar Lapp wrote: > I think he meant the cvs repository itself, containing all the > change data. -hilmar > > On Jun 15, 2007, at 9:42 PM, Chris Fields wrote: > >> The browsable CVS has a 'Download tarball' link if that helps. >> >> http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/? >> cvsroot=bioperl >> >> chris >> >> On Jun 15, 2007, at 8:08 PM, George Hartzell wrote: >> >>> Hilmar Lapp writes: >>>> So should we set up a sandbox svn repository and those who would >>>> like >>>> to help out >>>> >>>> - take shots at migrating bioperl (any current cvs snapshot will >>>> do) >>>> to svn >>> >>> Free Beer, huh? Do you deliver? >>> >>> Can you package up a tarball of the cvs repository (bzip or gzip >>> would >>> save some time) itself? 
>>> >>> thanks! >>> >>> g. >> >> >> > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Sat Jun 16 04:20:57 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Sat, 16 Jun 2007 09:20:57 +0100 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <18035.14352.963113.473274@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <18035.14352.963113.473274@almost.alerce.com> Message-ID: <46739D69.4090204@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 George Hartzell wrote: > Hilmar Lapp writes: > > So should we set up a sandbox svn repository and those who would like > > to help out > > > > - take shots at migrating bioperl (any current cvs snapshot will do) > > to svn > > Free Beer, huh? Do you deliver? > > Can you package up a tarball of the cvs repository (bzip or gzip would > save some time) itself? > > thanks! > > g. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Sounds like George might know what he's doing! I have a question about setting up svn access. I believe access can be done in several ways, over webdav, over ssh and probably others too. Do you have any knowledge about the benefits of one over the other? I suppose I'm thinking of what to implement to allow anonymous read access for users and authenticated access for developers. Nath p.s. if you need any monkeys to do some work I'm happy to help out as much as possible. 
-----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGc51pczuW2jkwy2gRAmi9AJ0XojVdh4ckXoc3bwVSmeNw95cR7QCfV+G9 Lb9NUEe4dkCakQ+Gc7Py98A= =BG9m -----END PGP SIGNATURE----- From rvos at interchange.ubc.ca Sat Jun 16 06:37:11 2007 From: rvos at interchange.ubc.ca (rvos) Date: Sat, 16 Jun 2007 03:37:11 -0700 (PDT) Subject: [Bioperl-l] SVN and ...Re: Perltidy Message-ID: <15232024.1181990231860.JavaMail.myubc2@handel.my.ubc.ca> I can volunteer some time to help out with this. Rutger -----Original Message----- > Date: Fri Jun 15 15:10:25 PDT 2007 > From: "Hilmar Lapp" > Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy > To: spiros at lokku.com > > So should we set up a sandbox svn repository and those who would like > to help out > > - take shots at migrating bioperl (any current cvs snapshot will do) > to svn > > - you document what you find yourself having to do in trying to make > it work > > - you report back when you think you have a working repository > > - we all get a defined amount of time to test to our hearts' content, > say 2 weeks > > - you fix issues that were encountered > > - report back when done, followed by retesting for, say 1 week > > - iterate previous 2 steps until no issues and no objections to > migration > > - two more weeks of warning period to all developers to commit all > outstanding changes, or reapply them to a future svn checkout > > - pull the trigger by locking down cvs, applying the migration as > worked out before, and announcing that BioPerl is now on svn > > - get free beer at next BOSC (I'll pay if no one else does) > > This may not be precisely the plan that needs to be executed, but > it's probably somewhere along those lines. > > If there are volunteers who would like to spearhead this, then power > to you - I think everyone is in favor and the advantages of svn don't > need to be debated. 
The only reason it hasn't happened yet is because > no one has stepped forward who would have the energy. > > I'm sure ChrisD will gladly create the svn sandbox if we have > volunteers lined up to get going. > > -hilmar > > On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote: > > > On 6/15/07, rvos wrote: > >> Hi, > >> > >> I would very much prefer it if bioperl moved to svn. I'm > >> considering merging Bio::Phylo (to the extent that that's possible/ > >> practical) with bioperl and move it to an OBF repository, but I'd > >> rather not go back to CVS. > >> > >> Rutger > >> > > > > I second that, SVN seems like the reasonable choice. I would be more > > than happy to help out as well. > > > > Spiros > > > >> > >> -----Original Message----- > >> > >>> Date: Fri Jun 15 07:56:23 PDT 2007 > >>> From: "Chris Fields" > >>> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy > >>> To: "Sendu Bala" > >>> > >>> > >>> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote: > >>> > >>>>>>> ... > >>>>>> Can we do any sort of massive conversion at some logical > >>>>>> timepoint. > >>>>>> Probably after a branch release or something? Because it > >>>>>> basically > >>>>>> means we're going to have differences on nearly every line > >>>>>> which is > >>>>>> going to make diff-ing difficult when debugging old/new versions. > >>>>>> Maybe it is not a problem because we aren't introducing and new > >>>>>> bugs! > >>>> > >>>> Sorry, can you clarify the problem you envisage? And why would > >>>> making a branch release help? > >>> > >>> Maybe the worry is that mass conversion in such a large codebase > >>> could lead to hard-to-locate bugs. Shouldn't occur but who knows > >>> w/o > >>> trying? > >>> > >>>>> I agree; if we intend on doing this it should be all at once, > >>>>> maybe on a branch dedicated to ensure that code changes don't > >>>>> tank tests (they shouldn't but one never knows). 
We would then > >>>>> need a script up- and-running that tidies everything up prior to > >>>>> commits (though what happens if perltidy tanks?...). > >>>>> Sendu, up for it? > >>>> > >>>> If its going to be difficult and a hassle, for such an unnecessary > >>>> thing I'm not sure its worth it. There are more pressing things to > >>>> be done for Bioperl. > >>>> > >>>> If I can just run perltidy on the entire package and commit, I'd do > >>>> it. If that's not appropriate, I won't. > >>> > >>> The choices aren't necessarily all or nothing. What about > >>> voluntary, > >>> recommended use of a perltidy config file included with the > >>> distribution, with additional 'caveats'? See my response to Sean. > >>> > >>>>>>> About svn > >>>> [snip] > >>>>> Stepped into that one, didn't I! I'll look into how much effort > >>>>> is involved and try getting something going in the next month or > >>>>> two, maybe sooner if time permits. I'm lacking on SVN-foo as > >>>>> well but it might be worth looking into. > >>>> > >>>> I'd put this in the unnecessary-but-nice category as well. If it > >>>> will be as easy as my ->new change, go ahead. If not, there are > >>>> more pressing matters (POD fixing, test script updating and > >>>> finishing...). > >>> > >>> A few other open-bio projects have actively discussed a CVS->SVN > >>> migration (BioRuby and I think BioPython, though the latter could be > >>> wrong). As I said, "it might be worth looking into" to weigh the > >>> pros/cons, get others opinions from others who have made the > >>> transition, etc. We could, as Jason suggested, even set up a tester > >>> SVN w/o making it the default codebase (lock it off to a few > >>> testers, > >>> have CVS commits automatically/manually carry over to SVN, etc). > >>> > >>> I agree with you that it's not feasible to switch over prior to a > >>> release and that there are more pressing issues, but it doesn't hurt > >>> having an open discussion about it. 
> >>> > >>> chris > >>> > >>> _______________________________________________ > >>> Bioperl-l mailing list > >>> Bioperl-l at lists.open-bio.org > >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >> > >> _______________________________________________ > >> Bioperl-l mailing list > >> Bioperl-l at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >> > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From sdavis2 at mail.nih.gov Sat Jun 16 07:21:47 2007 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Sat, 16 Jun 2007 07:21:47 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> Message-ID: <4673C7CB.1030709@mail.nih.gov> Chris Fields wrote: > I'll help out to the extent I can w/o having the SVN know-how. We > need (as Jason points out) someone who can detail the benefits and > maybe keep an updated journal on the wiki. > > I believe at least one or two of the other Bio* contemplated moving > over to SVN, which may be worth checking out. > The bioconductor project is on SVN. The project includes over 200 packages (the equivalent of perl modules) with something around 150-200 ACTIVE developers. They also have a build system for several OSes that operates on a cron-like system with builds of several versions approximately daily. 
Their system is running at something like revision 30,000, so they have
significant experience. If anyone would like technical support, I can
certainly ask the folks maintaining their site if they can give some
input. Let me know if anyone would like a contact person.

As for access, the typical access is over http (or https). Access
controls can be set up on the server side while allowing anonymous
access for checkout. There are many excellent SVN clients for every OS,
so that should not be a problem.

Sean

From cjfields at uiuc.edu Sat Jun 16 10:02:35 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 16 Jun 2007 09:02:35 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <4673C7CB.1030709@mail.nih.gov>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
Message-ID: <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>

On Jun 16, 2007, at 6:21 AM, Sean Davis wrote:

> Chris Fields wrote:
>> I'll help out to the extent I can w/o having the SVN know-how. We
>> need (as Jason points out) someone who can detail the benefits and
>> maybe keep an updated journal on the wiki.
>>
>> I believe at least one or two of the other Bio* contemplated moving
>> over to SVN, which may be worth checking out.
>>
> The bioconductor project is on SVN. The project includes over 200
> packages (the equivalent of perl modules) with something around
> 150-200 ACTIVE developers. They also have a build system for several
> OSes that operates on a cron-like system with builds of several
> versions approximately daily. Their system is running at something
> like revision 30,000, so they have significant experience. If anyone
> would like technical support, I can certainly ask the folks
> maintaining their site if they can give some input. Let me know if
> anyone would like a contact person.
> > As for access, the typical access is over http (or https). Access > controls can be set up on the server side while allowing anonymous > access for checkout. There are many excellent SVN for every OS, so > that > should not be a problem. > > Sean It looks like George Hartzell may be taking a crack at it, with Rutger Vos, Nathan Haigh, and moi helping out where needed. If so we could have something testable relatively soon. After that we'll need to work out a few other issues, basically what's on Hilmar's list. chris From hlapp at gmx.net Sat Jun 16 10:40:08 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Sat, 16 Jun 2007 10:40:08 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> Message-ID: <51E89347-4AF7-482E-98DB-BE1AA0138A91@gmx.net> Just as an aside, even if we can't keep anonymous cvs working, I would think that using apache URL rewriting and a small CGI script that returns an appropriate page redirect we can without too much trouble keep the hyperlinks functional that people may have bookmarked -hilmar On Jun 15, 2007, at 6:23 PM, Jason Stajich wrote: > Sounds like a plan, I'll be curious to see if we can still get keep > anonymous CVS working as I'd like to not have to pull the plug on > that. There are some threads out on the web about how to do this > with a commit rule on SVN. > > Also, can someone who is close enough to all the SVN benefits > please elaborate how it is going to help _this_ project? > Perhaps you would be willing to put a few words up -- like on (a to > be created): > http://bioperl.org/wiki/BioPerl:Version_control_changeover > > This way if anonymous CVS is broken and/or developers who haven't > been paying attention come back to commit code ask why things > changed we don't have to compose long emails... 
=) > > -jason > On Jun 15, 2007, at 3:10 PM, Hilmar Lapp wrote: > >> So should we set up a sandbox svn repository and those who would like >> to help out >> >> - take shots at migrating bioperl (any current cvs snapshot will do) >> to svn >> >> - you document what you find yourself having to do in trying to make >> it work >> >> - you report back when you think you have a working repository >> >> - we all get a defined amount of time to test to our hearts' content, >> say 2 weeks >> >> - you fix issues that were encountered >> >> - report back when done, followed by retesting for, say 1 week >> >> - iterate previous 2 steps until no issues and no objections to >> migration >> >> - two more weeks of warning period to all developers to commit all >> outstanding changes, or reapply them to a future svn checkout >> >> - pull the trigger by locking down cvs, applying the migration as >> worked out before, and announcing that BioPerl is now on svn >> >> - get free beer at next BOSC (I'll pay if no one else does) >> >> This may not be precisely the plan that needs to be executed, but >> it's probably somewhere along those lines. >> >> If there are volunteers who would like to spearhead this, then power >> to you - I think everyone is in favor and the advantages of svn don't >> need to be debated. The only reason it hasn't happened yet is because >> no one has stepped forward who would have the energy. > >> >> I'm sure ChrisD will gladly create the svn sandbox if we have >> volunteers lined up to get going. >> >> -hilmar >> >> On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote: >> >>> On 6/15/07, rvos wrote: >>>> Hi, >>>> >>>> I would very much prefer it if bioperl moved to svn. I'm >>>> considering merging Bio::Phylo (to the extent that that's possible/ >>>> practical) with bioperl and move it to an OBF repository, but I'd >>>> rather not go back to CVS. >>>> >>>> Rutger >>>> >>> >>> I second that, SVN seems like the reasonable choice. 
I would be more >>> than happy to help out as well. >>> >>> Spiros >>> >>>> >>>> -----Original Message----- >>>> >>>>> Date: Fri Jun 15 07:56:23 PDT 2007 >>>>> From: "Chris Fields" >>>>> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy >>>>> To: "Sendu Bala" >>>>> >>>>> >>>>> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote: >>>>> >>>>>>>>> ... >>>>>>>> Can we do any sort of massive conversion at some logical >>>>>>>> timepoint. >>>>>>>> Probably after a branch release or something? Because it >>>>>>>> basically >>>>>>>> means we're going to have differences on nearly every line >>>>>>>> which is >>>>>>>> going to make diff-ing difficult when debugging old/new >>>>>>>> versions. >>>>>>>> Maybe it is not a problem because we aren't introducing and new >>>>>>>> bugs! >>>>>> >>>>>> Sorry, can you clarify the problem you envisage? And why would >>>>>> making a branch release help? >>>>> >>>>> Maybe the worry is that mass conversion in such a large codebase >>>>> could lead to hard-to-locate bugs. Shouldn't occur but who knows >>>>> w/o >>>>> trying? >>>>> >>>>>>> I agree; if we intend on doing this it should be all at once, >>>>>>> maybe on a branch dedicated to ensure that code changes don't >>>>>>> tank tests (they shouldn't but one never knows). We would then >>>>>>> need a script up- and-running that tidies everything up prior to >>>>>>> commits (though what happens if perltidy tanks?...). >>>>>>> Sendu, up for it? >>>>>> >>>>>> If its going to be difficult and a hassle, for such an >>>>>> unnecessary >>>>>> thing I'm not sure its worth it. There are more pressing >>>>>> things to >>>>>> be done for Bioperl. >>>>>> >>>>>> If I can just run perltidy on the entire package and commit, >>>>>> I'd do >>>>>> it. If that's not appropriate, I won't. >>>>> >>>>> The choices aren't necessarily all or nothing. What about >>>>> voluntary, >>>>> recommended use of a perltidy config file included with the >>>>> distribution, with additional 'caveats'? See my response to Sean. 
>>>>> >>>>>>>>> About svn >>>>>> [snip] >>>>>>> Stepped into that one, didn't I! I'll look into how much effort >>>>>>> is involved and try getting something going in the next >>>>>>> month or >>>>>>> two, maybe sooner if time permits. I'm lacking on SVN-foo as >>>>>>> well but it might be worth looking into. >>>>>> >>>>>> I'd put this in the unnecessary-but-nice category as well. If it >>>>>> will be as easy as my ->new change, go ahead. If not, there are >>>>>> more pressing matters (POD fixing, test script updating and >>>>>> finishing...). >>>>> >>>>> A few other open-bio projects have actively discussed a CVS->SVN >>>>> migration (BioRuby and I think BioPython, though the latter >>>>> could be >>>>> wrong). As I said, "it might be worth looking into" to weigh the >>>>> pros/cons, get others opinions >from others who have made the >>>>> transition, etc. We could, as Jason suggested, even set up a >>>>> tester >>>>> SVN w/o making it the default codebase (lock it off to a few >>>>> testers, >>>>> have CVS commits automatically/manually carry over to SVN, etc). >>>>> >>>>> I agree with you that it's not feasible to switch over prior to a >>>>> release and that there are more pressing issues, but it doesn't >>>>> hurt >>>>> having an open discussion about it. 
>>>>> >>>>> chris >>>>> >>>>> _______________________________________________ >>>>> Bioperl-l mailing list >>>>> Bioperl-l at lists.open-bio.org >>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> -- >> =========================================================== >> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >> =========================================================== >> >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > Jason Stajich > jason at bioperl.org > http://jason.open-bio.org/ > > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Sat Jun 16 10:55:09 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Sat, 16 Jun 2007 10:55:09 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <4673C7CB.1030709@mail.nih.gov> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> Message-ID: On Jun 16, 2007, at 7:21 AM, Sean Davis wrote: > As for access, the typical access is over http (or https). We're using svn+ssh here (NESCent) so the password is the same as the one you set for your account on the server, and you can use public/ private key negotiation for authentication. I think the ability to not provide a password for every single interaction is a requirement. 
Whether that requires svn+ssh or can be made to work through https too,
I don't know. On sf.net I have to use https for svn and it doesn't ask
me for the password each time. Not sure how this works though, maybe
some local caching?

We should not be using http, or any other protocol that sends
unencrypted passwords.

> Access controls can be set up on the server side while allowing
> anonymous access for checkout. There are many excellent SVN for
> every OS, so that should not be a problem.

On Mac OSX the most convenient way I have found is through fink. It
does ask to install 30 other dependencies, which had me balk at first,
but me doing it by hand is even worse than fink doing it, so I finally
gave in and it's really a breeze. I've not had a single issue.

From a sysadmin perspective, what might be worth keeping in mind is
that svn is going to store everything in a database (BerkeleyDB I
think). I.e., there is no longer any such thing as restoring individual
source code files from backup if one gets accidentally corrupted on the
server. It seems you have to restore the entire database, i.e., the
entire repository. I vaguely recall though that how svn manages the
repository is actually configurable and that storage other than DB is
possible too. Don't ask me for the pros and cons of one vs the other.

	-hilmar
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From rvos at interchange.ubc.ca Sat Jun 16 13:09:18 2007
From: rvos at interchange.ubc.ca (rvos)
Date: Sat, 16 Jun 2007 10:09:18 -0700 (PDT)
Subject: [Bioperl-l] SVN and ...Re: Perltidy
Message-ID: <10517356.1182013758160.JavaMail.myubc2@brahms.my.ubc.ca>

CIPRES and Bio::Phylo use svn.
As for the benefits, a lot of sales talk has been expended over it already, for my own purpose I like the integration with eclipse (through subclipse plugin) and komodo, in addition to the atomic commits (so I can ctrl+c if I goof up (again)). For standalone use on osx I didn't use the fink one, but I forgot where I did get it from. It was very easy to set up, though. On windows there is a really nice standalone one (tortoisesvn) that integrates with the explorer so you can see on the file icons what the state of a file is. I know that there's a cvs2svn utility that converts your revision history (seems a requirement). Rutger -----Original Message----- > Date: Sat Jun 16 07:55:09 PDT 2007 > From: "Hilmar Lapp" > Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy > To: "Sean Davis" > > > On Jun 16, 2007, at 7:21 AM, Sean Davis wrote: > > > As for access, the typical access is over http (or https). > > We're using svn+ssh here (NESCent) so the password is the same as the > one you set for your account on the server, and you can use public/ > private key negotiation for authentication. > > I think the ability to not provide a password for every single > interaction is a requirement. If that requires using svn+ssh or can > be made to work through https too I don't know. On sf.net I have to > use https for svn and it doesn't ask me for the password each time. > Not sure how this works though, maybe some local caching? > > We should not be using http, or whatever other protocol that sends > unencrypted passwords. > > > Access controls can be set up on the server side while allowing > > anonymous access for checkout. There are many excellent SVN for > > every OS, so that should not be a problem. > > On Mac OSX the most convenient way I have found is through fink. It > does ask to install 30 other dependencies, which had me balk at > first, but me doing it by hand is even worse than fink doing it, so I > finally gave in and it's really a breeze. I've not had a single issue. 
> > From a sysadmin perspective, what might be worth keeping in mind is > that svn is going to store everything in a database (BerkeleyDB I > think). I.e., there is no such thing anymore as restoring individual > source code files from backup if one gets accidentally corrupted on > the server. It seems you have to restore the entire database, i.e., > the entire repository. I vaguely recall though that how svn manages > the repository is actually configurable and that other storage than > DB is possible too. Don't ask me for the pros and cons of one vs the > other. > > -hilmar > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From rvos at interchange.ubc.ca Sat Jun 16 13:15:45 2007 From: rvos at interchange.ubc.ca (rvos) Date: Sat, 16 Jun 2007 10:15:45 -0700 (PDT) Subject: [Bioperl-l] SVN and ...Re: Perltidy Message-ID: <27462805.1182014145280.JavaMail.myubc2@brahms.my.ubc.ca> A brief word on the topic of perltidy: no. I like what it does, and I sort of follow one of its settings (-syn -sob -b), but if you run it on a whole source tree it'll screw up the diffs, and I'm still worried about it breaking things (though really it shouldn't, it creates a *.bak if something doesn't compile anymore). Rutger -----Original Message----- > Date: Sat Jun 16 10:09:18 PDT 2007 > From: "rvos" > Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy > To: "Hilmar Lapp" , "Sean Davis" > > CIPRES and Bio::Phylo use svn. As for the benefits, a lot of sales talk has been expended over it already, for my own purpose I like the integration with eclipse (through subclipse plugin) and komodo, in addition to the atomic commits (so I can ctrl+c if I goof up (again)). 
> > For standalone use on osx I didn't use the fink one, but I forgot where I did get it from. It was very easy to set up, though. On windows there is a really nice standalone one (tortoisesvn) that integrates with the explorer so you can see on the file icons what the state of a file is. I know that there's a cvs2svn utility that converts your revision history (seems a requirement). > > Rutger > > > -----Original Message----- > > > Date: Sat Jun 16 07:55:09 PDT 2007 > > From: "Hilmar Lapp" > > Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy > > To: "Sean Davis" > > > > > > On Jun 16, 2007, at 7:21 AM, Sean Davis wrote: > > > > > As for access, the typical access is over http (or https). > > > > We're using svn+ssh here (NESCent) so the password is the same as the > > one you set for your account on the server, and you can use public/ > > private key negotiation for authentication. > > > > I think the ability to not provide a password for every single > > interaction is a requirement. If that requires using svn+ssh or can > > be made to work through https too I don't know. On sf.net I have to > > use https for svn and it doesn't ask me for the password each time. > > Not sure how this works though, maybe some local caching? > > > > We should not be using http, or whatever other protocol that sends > > unencrypted passwords. > > > > > Access controls can be set up on the server side while allowing > > > anonymous access for checkout. There are many excellent SVN for > > > every OS, so that should not be a problem. > > > > On Mac OSX the most convenient way I have found is through fink. It > > does ask to install 30 other dependencies, which had me balk at > > first, but me doing it by hand is even worse than fink doing it, so I > > finally gave in and it's really a breeze. I've not had a single issue. > > > > From a sysadmin perspective, what might be worth keeping in mind is > > that svn is going to store everything in a database (BerkeleyDB I > > think). 
I.e., there is no such thing anymore as restoring individual > > source code files from backup if one gets accidentally corrupted on > > the server. It seems you have to restore the entire database, i.e., > > the entire repository. I vaguely recall though that how svn manages > > the repository is actually configurable and that other storage than > > DB is possible too. Don't ask me for the pros and cons of one vs the > > other. > > > > -hilmar > > -- > > =========================================================== > > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > > =========================================================== > > > > > > > > > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From george.heller at yahoo.com Sat Jun 16 13:29:26 2007 From: george.heller at yahoo.com (George Heller) Date: Sat, 16 Jun 2007 10:29:26 -0700 (PDT) Subject: [Bioperl-l] Taxonomy hierarchy extraction Message-ID: <959624.48556.qm@web56502.mail.re3.yahoo.com> Hi all, I am looking at extracting the taxonomy hierarchy for some taxon ids. What I plan to do is, for a given taxon id, say 33090, I want to extract all taxon ids that are children of this species. I do not just want the immediate children, but the children's children and so on. Any ideas on the way I can go about doing this? George --------------------------------- Shape Yahoo! in your own image. Join our Network Research Panel today! 
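One way to picture the question above is as a plain recursive walk over a parent-to-children mapping. The following is only a minimal sketch under stated assumptions: the %children_of hash and the all_descendants sub are hypothetical names, and every taxon id other than 33090 is made up for illustration. It does not use BioPerl at all; with a real taxonomy database the hash lookup would be replaced by whatever call returns the immediate children of a taxon.

```perl
use strict;
use warnings;

# Hypothetical toy hierarchy: parent taxon id => list of child taxon ids.
# Only 33090 comes from the thread; the other ids are invented.
my %children_of = (
    33090 => [ 1001, 1002 ],
    1001  => [ 2001, 2002 ],
    1002  => [],
    2001  => [],
    2002  => [ 3001 ],
    3001  => [],
);

# Return every descendant of $id (children, grandchildren, and so on),
# visiting each child before moving on to its next sibling (depth-first).
sub all_descendants {
    my ($tree, $id) = @_;
    my @found;
    for my $child ( @{ $tree->{$id} || [] } ) {
        push @found, $child, all_descendants($tree, $child);
    }
    return @found;
}

my @descendants = all_descendants(\%children_of, 33090);
print join(',', @descendants), "\n";
```

The recursion stops by itself at leaves, because a taxon with no children contributes an empty list. For very deep or very large trees an explicit stack or a database-side query may be preferable to recursion.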
From bix at sendu.me.uk Sat Jun 16 14:21:38 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Sat, 16 Jun 2007 19:21:38 +0100 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <959624.48556.qm@web56502.mail.re3.yahoo.com> References: <959624.48556.qm@web56502.mail.re3.yahoo.com> Message-ID: <46742A32.90305@sendu.me.uk> George Heller wrote: > Hi all, > > I am looking at extracting the taxonomy hierarchy for some taxon ids. > What I plan to do is, for a given taxon id, say 33090, I want to > extract all taxon ids that are children of this species. I do not > just want the immediate children, but the children's children and so > on. > > Any ideas on the way I can go about doing this? Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in some kind of looping structure. Most easily a recursing sub. If you happen to code up something neat and efficient, why not share it with us and we could add it to the Taxonomy module(s). From cjfields at uiuc.edu Sat Jun 16 15:23:43 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 16 Jun 2007 14:23:43 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> Message-ID: On Jun 16, 2007, at 9:55 AM, Hilmar Lapp wrote: > > On Jun 16, 2007, at 7:21 AM, Sean Davis wrote: > >> As for access, the typical access is over http (or https). > > We're using svn+ssh here (NESCent) so the password is the same as the > one you set for your account on the server, and you can use public/ > private key negotiation for authentication. > > I think the ability to not provide a password for every single > interaction is a requirement. If that requires using svn+ssh or can > be made to work through https too I don't know. On sf.net I have to > use https for svn and it doesn't ask me for the password each time. 
> Not sure how this works though, maybe some local caching? > > We should not be using http, or whatever other protocol that sends > unencrypted passwords. Agreed; it should be through ssh. >> Access controls can be set up on the server side while allowing >> anonymous access for checkout. There are many excellent SVN for >> every OS, so that should not be a problem. > > On Mac OSX the most convenient way I have found is through fink. It > does ask to install 30 other dependencies, which had me balk at > first, but me doing it by hand is even worse than fink doing it, so I > finally gave in and it's really a breeze. I've not had a single issue. > > From a sysadmin perspective, what might be worth keeping in mind is > that svn is going to store everything in a database (BerkeleyDB I > think). I.e., there is no such thing anymore as restoring individual > source code files from backup if one gets accidentally corrupted on > the server. It seems you have to restore the entire database, i.e., > the entire repository. I vaguely recall though that how svn manages > the repository is actually configurable and that other storage than > DB is possible too. Don't ask me for the pros and cons of one vs the > other. MacPorts/DarwinPorts also has subversion, various language bindings, cvs2svn, and various perl modules. There are also a few SVN GUIs lingering around (including live folders within Komodo). chris From cjfields at uiuc.edu Sat Jun 16 15:18:06 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 16 Jun 2007 14:18:06 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <27462805.1182014145280.JavaMail.myubc2@brahms.my.ubc.ca> References: <27462805.1182014145280.JavaMail.myubc2@brahms.my.ubc.ca> Message-ID: <1A314D08-8F3C-4A4B-B58D-64AC7952F149@uiuc.edu> I think it's viable as an option if the code really needs it. After 100+ commits some of the code has schizy coding styles, so cleaning it up helps. 
In those cases having a perltidy config file present wouldn't hurt. However I agree that it shouldn't be applied across every module and should be done judiciously (the commit message, for instance, should actually state the code was tidied). chris PS - Nice to see the ball is rolling on SVN! On Jun 16, 2007, at 12:15 PM, rvos wrote: > A brief word on the topic of perltidy: no. I like what it does, and > I sort of follow one of its settings (-syn -sob -b), but if you run > it on a whole source tree it'll screw up the diffs, and I'm still > worried about it breaking things (though really it shouldn't, it > creates a *.bak if something doesn't compile anymore). > > Rutger > > > > -----Original Message----- > >> Date: Sat Jun 16 10:09:18 PDT 2007 >> From: "rvos" >> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy >> To: "Hilmar Lapp" , "Sean Davis" >> >> >> CIPRES and Bio::Phylo use svn. As for the benefits, a lot of sales >> talk has been expended over it already, for my own purpose I like >> the integration with eclipse (through subclipse plugin) and >> komodo, in addition to the atomic commits (so I can ctrl+c if I >> goof up (again)). >> >> For standalone use on osx I didn't use the fink one, but I forgot >> where I did get it from. It was very easy to set up, though. On >> windows there is a really nice standalone one (tortoisesvn) that >> integrates with the explorer so you can see on the file icons what >> the state of a file is. I know that there's a cvs2svn utility that >> converts your revision history (seems a requirement). >> >> Rutger >> >> >> -----Original Message----- >> >>> Date: Sat Jun 16 07:55:09 PDT 2007 >>> From: "Hilmar Lapp" >>> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy >>> To: "Sean Davis" >>> >>> >>> On Jun 16, 2007, at 7:21 AM, Sean Davis wrote: >>> >>>> As for access, the typical access is over http (or https). 
>>> >>> We're using svn+ssh here (NESCent) so the password is the same as >>> the >>> one you set for your account on the server, and you can use public/ >>> private key negotiation for authentication. >>> >>> I think the ability to not provide a password for every single >>> interaction is a requirement. If that requires using svn+ssh or can >>> be made to work through https too I don't know. On sf.net I have to >>> use https for svn and it doesn't ask me for the password each time. >>> Not sure how this works though, maybe some local caching? >>> >>> We should not be using http, or whatever other protocol that sends >>> unencrypted passwords. >>> >>>> Access controls can be set up on the server side while allowing >>>> anonymous access for checkout. There are many excellent SVN for >>>> every OS, so that should not be a problem. >>> >>> On Mac OSX the most convenient way I have found is through fink. It >>> does ask to install 30 other dependencies, which had me balk at >>> first, but me doing it by hand is even worse than fink doing it, >>> so I >>> finally gave in and it's really a breeze. I've not had a single >>> issue. >>> >>> From a sysadmin perspective, what might be worth keeping in >>> mind is >>> that svn is going to store everything in a database (BerkeleyDB I >>> think). I.e., there is no such thing anymore as restoring individual >>> source code files from backup if one gets accidentally corrupted on >>> the server. It seems you have to restore the entire database, i.e., >>> the entire repository. I vaguely recall though that how svn manages >>> the repository is actually configurable and that other storage than >>> DB is possible too. Don't ask me for the pros and cons of one vs the >>> other. 
>>> >>> -hilmar >>> -- >>> =========================================================== >>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >>> =========================================================== >>> >>> >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hartzell at alerce.com Sat Jun 16 13:47:01 2007 From: hartzell at alerce.com (George Hartzell) Date: Sat, 16 Jun 2007 10:47:01 -0700 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <18035.14352.963113.473274@almost.alerce.com> <6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu> <6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net> Message-ID: <18036.8725.29073.619527@almost.alerce.com> Chris Fields writes: > Ah, got it. Sorry. > > George, planning on taking this up? I'm going to take a *peek*. I just finished (unless someone finds another issue) moving someone's cvs repository over to svn, so I have some tools cobbled together and some knowledge in the cache. I don't have too much idle time at the moment though, so if it gets gooey I'll just summarize what I learn. Either way it seems worth a peek. I will need the repository itself though. I'll post a note to support at open-bio.org. g. 
From jason at bioperl.org Sat Jun 16 19:54:18 2007 From: jason at bioperl.org (Jason Stajich) Date: Sat, 16 Jun 2007 16:54:18 -0700 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <18036.8725.29073.619527@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <18035.14352.963113.473274@almost.alerce.com> <6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu> <6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net> <18036.8725.29073.619527@almost.alerce.com> Message-ID: <6F57475B-715F-49D1-B6D2-F3FD3ACCB728@bioperl.org> Thanks George. I'll respond to your support ticket as well but I put up tarballs of the repository as of today. I had thought at one point ChrisD might have set up rsync-able access to the whole repository through code.open-bio.org but for now I have put up tarballs of most of the CVS dirs from bioperl http://bioperl.org/uploads/ Just to say I already went through all the steps of running cvs2svn myself and had problems gathering back out the branches and all the tags when I tried it. You may want to start with a smaller repository like bioperl-network or bioperl-db, as the initial cvs2svn conversion script took quite a long time to run on bioperl-live. Regarding ssh/https: We have already gone through some of this for blipkit and biojava projects. I think we'll still keep separate anonymous read-only (code.open-bio.org) and writeable repositories (dev.open-bio.org) as I think we are resisting any webapps on the development server as we want that to be as locked down as possible. For the newly created svn repositories that I've been creating/using I just use svn+ssh and that worked okay. -jason On Jun 16, 2007, at 10:47 AM, George Hartzell wrote: > Chris Fields writes: >> Ah, got it. Sorry. >> >> George, planning on taking this up? > > I'm going to take a *peek*.
I just finished (unless someone finds > another issue) moving someone's cvs repository over to svn, so I have > some tools cobbled together and some knowledge in the cache. > > I don't have too much idle time at the moment though, so if it gets > gooey I'll just summarize what I learn. Either way it seems worth a > peek. > > I will need the repository itself though. I'll post a note to > support at open-bio.org. > > g. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From hartzell at alerce.com Sat Jun 16 19:56:09 2007 From: hartzell at alerce.com (George Hartzell) Date: Sat, 16 Jun 2007 16:56:09 -0700 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <46739D69.4090204@sheffield.ac.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <18035.14352.963113.473274@almost.alerce.com> <46739D69.4090204@sheffield.ac.uk> Message-ID: <18036.30873.609341.181853@almost.alerce.com> Nathan S. Haigh writes: > [...] > Sounds like George might know what he's doing! Hey, I've been looking for a Marketing Director. Want a job? > I have a question about > setting up svn access. I believe access can be done in several ways, > over webdav, over ssh and probably others too. Do you have any knowledge > about the benefits of one over the other? I suppose I'm thinking of what > to implement to allow anonymous read access for users and authenticated > access for developers. There are two and a half ways to talk to the repository: - You can put it behind a web server (e.g. apache) and get at it using http/https. Authentication and authorization happen using the normal web server tricks, so as long as you don't do anything silly (e.g. don't use basic auth, stick with mod_auth_digest), even http connections won't send passwords in the clear. 
You can define users in .htpasswd files or use any of the fancier setups (e.g. sql databases, etc...). - You can talk to it via subversion's simple server, svnserve. There are two ways you usually talk to svnserve (neither of which sends passwords in the clear): * directly, using a URL like svn://svn.example.com/repo/proj/trunk when you do this the client either talks directly to a copy of svnserve running as a daemon, or possibly to something like inetd that'll start an svnserve as necessary. In this case, you define authentication and authorization info in an svnserve.conf file. * indirectly, using a URL like svn+ssh://svn.example.com/repo/proj/trunk/ in which case you make an ssh connection to the server machine (and authenticate via ssh mechanisms, anything other than a key-pair will drive you nuts with repeated password requests) and then an svnserve process is started up for you in "tunnel mode". Access control is coarse grained and via OS level access permissions. Generally in this case you need to give out shell accounts to everyone involved, or (tsk, tsk) have them use a common account. There's a cute trick in the svn book that shows how to use a shared ssh account but still have all of the changes in the repo keep track of the real user. I've never tried it.... - If you're on the same machine as the repo, you can simply use: file:///path/to/repo/proj/trunk The biggest deciding factor is how you want to manage your users and whether you're already messing around with a web server. I've generally worked in small groups and everyone's had ssh access, but I've set it up the other ways too. You can even access via multiple paths.
The only trick is that the repository needs to be writable by whoever's committing, and if they're running svnserve themselves (file: or svn+ssh:) and things aren't set up right (all the dirs in the repo need to be group writable and have the magic bit set so that any new stuff created is also writable, users' umasks and group membership need to be aligned) then things go fubar. Google's your friend here, and each of the OSes/distros has a standard hack for making this work, usually involving a wrapper app that takes care of things. Feel free to ask any particular questions. Phew, g. From jason at bioperl.org Sat Jun 16 20:17:58 2007 From: jason at bioperl.org (Jason Stajich) Date: Sat, 16 Jun 2007 17:17:58 -0700 Subject: [Bioperl-l] seq doesn't validate error In-Reply-To: <200706151653.04135.sheris@eps.berkeley.edu> References: <200706151558.12911.sheris@eps.berkeley.edu> <1A4207F8295607498283FE9E93B775B4034C6E7B@EX02.asurite.ad.asu.edu> <200706151653.04135.sheris@eps.berkeley.edu> Message-ID: <6A369DE9-943A-4DF1-9DF0-F68E361C8C20@bioperl.org> The error is clearly saying there must be a symbol or letter in your sequence that violates the regexp. I had modified the code in CVS to actually provide a more informative mismatch error in the error message, but this is probably not in the release you are using. Anyways, add this to see what is causing the problem: print join(",",($nstarthash{$_}[1] =~ /([^ $Bio::PrimarySeq::MATCHPATTERN]+)/g)), "\n"; -jason On Jun 15, 2007, at 4:53 PM, Sheri Simmons wrote: > Thanks for the suggestion, but that still gives the same error as > before. > > On Friday 15 June 2007 4:11 pm, Kevin Brown wrote: >>> I'm getting an error as follows when I try to reverse >>> complement a sequence string stored in a hash of arrays. The >>> storage code is: >>> >>> $nstarthash{$key} = [$sortchecks[0], join("", >>> @nseq), >>> join("",@{$seqhash{$key}})]; >>> >>> the sequence of interest is the element at index 1.
>>> >>> Later, I try to retrieve this string for a subset of keys so >>> I can reverse complement it based on input from another hash >>> (%complement): >>> >>> my %revcomphash = map { my $read = $_; >>> grep $complement{$read} eq 'C', %complement; >>> {$_, (Bio::Seq->new(-seq >>> =>$nstarthash{$_}[1]))->revcom->seq()};} >>> keys(%nstarthash); >>> >>> >>> I get the following warning (long sequence edited for clarity): >>> >>> -- -------------------- WARNING --------------------- >>> MSG: seq doesn't validate, mismatch is 1 >>> --------------------------------------------------- >>> >>> ------------- EXCEPTION ------------- >>> MSG: Attempting to set the sequence to >>> [GCCCCTGTAATCGCTTTTATATCGTCAGCGATC] >>> which does not look healthy >>> STACK Bio::PrimarySeq::seq /usr/share/perl5/Bio/PrimarySeq.pm:268 >>> STACK Bio::PrimarySeq::new /usr/share/perl5/Bio/PrimarySeq.pm:217 >>> STACK Bio::Seq::new /usr/share/perl5/Bio/Seq.pm:498 STACK >>> toplevel ../quality_wrapper.pl:103 >>> >>> I cannot find any non-allowed characters in the sequence, and >>> the de-referencing appears to work correctly. Can anyone help me? >>> I'm using the latest Bioperl installation (1.5.2) with >>> ActivePerl5.8 on a Mepis 6.5 system. >> >> Try telling the Bio::Seq object what alphabet to use when creating >> it. >> I tend to create them like: >> >> Bio::Seq->new(-seq=> $seqvar, -alphabet=>'dna') > > -- > Sheri Simmons > Department of Earth and Planetary Sciences > University of California, Berkeley > Berkeley, CA 94720-4767 > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From n.haigh at sheffield.ac.uk Sun Jun 17 07:45:11 2007 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Sun, 17 Jun 2007 12:45:11 +0100 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <10517356.1182013758160.JavaMail.myubc2@brahms.my.ubc.ca> References: <10517356.1182013758160.JavaMail.myubc2@brahms.my.ubc.ca> Message-ID: <46751EC7.8020609@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 rvos wrote: > CIPRES and Bio::Phylo use svn. As for the benefits, a lot of sales talk has been expended over it already, for my own purpose I like the integration with eclipse (through subclipse plugin) and komodo, in addition to the atomic commits (so I can ctrl+c if I goof up (again)). > > For standalone use on osx I didn't use the fink one, but I forgot where I did get it from. It was very easy to set up, though. On windows there is a really nice standalone one (tortoisesvn) that integrates with the explorer so you can see on the file icons what the state of a file is. I know that there's a cvs2svn utility that converts your revision history (seems a requirement). > > Rutger > > Just to clarify, subversion is available as command line for windows: http://subversion.tigris.org/project_packages.html TortoiseSVN is another svn client with a GUI that integrates into the shell. I tried setting this up a while back to use ssh (via PUTTY), but I wasn't successful. This may have been due to me just starting out with svn or that it was harder to setup in an earlier version of TortoiseSVN. Does anyone have experience of setting up svn on Windows to use ssh? If the changeover takes place, I'm happy to write some howto's for setting up svn clients for Windows. 
Nath -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGdR7HczuW2jkwy2gRAmgOAJ96wLzVYbjqEPborZTsw6gwU6UitgCfV02v 8xHJvn/Eqf9LePR3Ei0ZaIw= =t5pN -----END PGP SIGNATURE----- From george.heller at yahoo.com Sun Jun 17 14:41:55 2007 From: george.heller at yahoo.com (George Heller) Date: Sun, 17 Jun 2007 11:41:55 -0700 (PDT) Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <46742A32.90305@sendu.me.uk> Message-ID: <148654.15952.qm@web56511.mail.re3.yahoo.com> Hi all, Can anyone point me to some example that uses the get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at this, and I am not quite sure how to implement it. Thanks. George Sendu Bala wrote: George Heller wrote: > Hi all, > > I am looking at extracting the taxonomy hierarchy for some taxon ids. > What I plan to do is, for a given taxon id, say 33090, I want to > extract all taxon ids that are children of this species. I do not > just want the immediate children, but the children's children and so > on. > > Any ideas on the way I can go about doing this? Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in some kind of looping structure. Most easily a recursing sub. If you happen to code up something neat and efficient, why not share it with us and we could add it to the Taxonomy module(s). --------------------------------- Shape Yahoo! in your own image. Join our Network Research Panel today! From jason at bioperl.org Sun Jun 17 16:48:05 2007 From: jason at bioperl.org (Jason Stajich) Date: Sun, 17 Jun 2007 13:48:05 -0700 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <148654.15952.qm@web56511.mail.re3.yahoo.com> References: <148654.15952.qm@web56511.mail.re3.yahoo.com> Message-ID: <617EAA37-D502-42AE-8820-C9514DBF7ACD@bioperl.org> I assume you already figured out how to setup a local taxonomydb? 
You just want the extant species/leaves of the tree: my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents; -jason On Jun 17, 2007, at 11:41 AM, George Heller wrote: > Hi all, > > Can anyone point me to some example that uses the > get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at > this, and I am not quite sure how to implement it. > > Thanks. > George > > Sendu Bala wrote: > George Heller wrote: >> Hi all, >> >> I am looking at extracting the taxonomy hierarchy for some taxon ids. >> What I plan to do is, for a given taxon id, say 33090, I want to >> extract all taxon ids that are children of this species. I do not >> just want the immediate children, but the children's children and so >> on. >> >> Any ideas on the way I can go about doing this? > > Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in > some kind of looping structure. Most easily a recursing sub. > > If you happen to code up something neat and efficient, why not > share it > with us and we could add it to the Taxonomy module(s). -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From aaron.j.mackey at gsk.com Sun Jun 17 22:35:42 2007 From: aaron.j.mackey at gsk.com (aaron.j.mackey at gsk.com) Date: Sun, 17 Jun 2007 22:35:42 -0400 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <46742A32.90305@sendu.me.uk> Message-ID: To do so efficiently, you might want to check out: http://www.oreillynet.com/pub/a/network/2002/11/27/bioconf.html -Aaron bioperl-l-bounces at lists.open-bio.org wrote on 06/16/2007 02:21:38 PM: > George Heller wrote: > > Hi all, > > > > I am looking at extracting the taxonomy hierarchy for some taxon ids.
> > What I plan to do is, for a given taxon id, say 33090, I want to > > extract all taxon ids that are children of this species. I do not > > just want the immediate children, but the children's children and so > > on. > > > > Any ideas on the way I can go about doing this? > > Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in > some kind of looping structure. Most easily a recursing sub. > > If you happen to code up something neat and efficient, why not share it > with us and we could add it to the Taxonomy module(s). > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From aaron.j.mackey at gsk.com Sun Jun 17 22:34:12 2007 From: aaron.j.mackey at gsk.com (aaron.j.mackey at gsk.com) Date: Sun, 17 Jun 2007 22:34:12 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: Message-ID: > On Jun 16, 2007, at 7:21 AM, Sean Davis wrote: > > > As for access, the typical access is over http (or https). > > We're using svn+ssh here (NESCent) Let me just note that https is preferable to ssh for those poor slobs stuck behind a corporate firewall (svn happily prompts me for my proxy server's user/pass, then my https authentication realm's user/pass - all then get cached in some .svn/ file that I don't have to worry about again until my proxy server password changes once a month ...) -Aaron From george.heller at yahoo.com Mon Jun 18 00:21:45 2007 From: george.heller at yahoo.com (George Heller) Date: Sun, 17 Jun 2007 21:21:45 -0700 (PDT) Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <617EAA37-D502-42AE-8820-C9514DBF7ACD@bioperl.org> Message-ID: <487845.37410.qm@web56510.mail.re3.yahoo.com> Thanks. And how can I assign the $node here in the below code, such that I can reference it to a particular taxon id record? I want to retrieve all the descendents from the taxonomy hierarchy, given a particular taxon id. 
I have a local db setup, in which I have uploaded data using the load_ncbi_taxonomy.pl script. Thanks. George Jason Stajich wrote: I assume you already figured out how to setup a local taxonomydb? You just want the extant species/leaves of the tree my @extant_children = grep { $_->is_Leaf } $node->get_all_Descedents; -jason On Jun 17, 2007, at 11:41 AM, George Heller wrote: Hi all, Can anyone point me to some example that uses the get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at this, and I am not quite sure how to implement it. Thanks. George Sendu Bala wrote: George Heller wrote: Hi all, I am looking at extracting the taxonomy hierarchy for some taxon ids. What I plan to do is, for a given taxon id, say 33090, I want to extract all taxon ids that are children of this species. I do not just want the immediate children, but the children's children and so on. Any ideas on the way I can go about doing this? Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in some kind of looping structure. Most easily a recursing sub. If you happen to code up something neat and efficient, why not share it with us and we could add it to the Taxonomy module(s). --------------------------------- Shape Yahoo! in your own image. Join our Network Research Panel today! _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ --------------------------------- Need a vacation? Get great deals to amazing places on Yahoo! Travel. From bix at sendu.me.uk Mon Jun 18 06:44:00 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 18 Jun 2007 11:44:00 +0100 Subject: [Bioperl-l] Network tests overhaul Message-ID: <467661F0.2060703@sendu.me.uk> When the test suite runs currently, most (the intent is all) tests skip if the test would require network (internet) access. 
This is to avoid tests failing not due to bugs in Bioperl code, but due to temporarily inaccessible servers. This is also to make running the test suite faster. To do a complete test you currently have to set BIOPERLDEBUG to true, which activates the network test but also increases verbosity. This actually causes a problem, since when running the entire test suite the additional debug information is more a hindrance than a help, since the reams of printed information can hide significant warnings that may also get printed. Its also ugly. The solution is to divorce activation of network tests from the request for verbosity. The obvious implementation is to have another environment variable, perhaps BIOPERLNETWORK. However, there is an opportunity to do something more appropriate. The running of networking tests should be a choice given to every end-user installing Bioperl. Debugging information, on the other hand, is only of interest to the developer working on a specific module under test, so can be left as a 'hidden' env var. I have just committed one possible implementation along these lines. You say: perl Build.PL as normal, and if you seem to have internet access it asks you if you'd like to run network tests. The default answer is no. If you answer yes, network tests will be enabled. You can alternatively say: perl Build.PL --network and if you seem to have internet access, network tests will be enabled. Then you run the tests: ./Build test Any tests written to support the new system will then skip network tests if they haven't been enabled. The only test I've written to support the new system is t/RemoteBlast.t: ./Build test --test_files t/RemoteBlast.t --verbose Adding support to test scripts consists of the following changes: + use Module::Build; + my $build = Module::Build->current(get_options => { network => {} }); + my $do_network_tests = $build->notes('network'); ! if (!$ENV{'BIOPERLDEBUG'}) { # skip network tests --- ! 
if (!$do_network_tests) { # skip network tests I propose adding this support to all test scripts that carry out network tests. Does anyone have objections? Does anyone have alternate implementations that may be superior? I specifically suggest we don't use an env var in addition to the above, because the multiple ways of doing things could lead to confusion. Which takes priority? Did a user really have the networking tests turned on when he reported his test results? The one thing I need help with is identifying which tests attempt to access the internet. I think we caught most of them for the 1.5.2 release, but I think there are more lurking around. Can anyone offer a way to systematically find at least the test scripts which access the internet, if not the specific tests within? Cheers, Sendu. From bix at sendu.me.uk Mon Jun 18 06:46:17 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 18 Jun 2007 11:46:17 +0100 Subject: [Bioperl-l] Network tests overhaul In-Reply-To: <467661F0.2060703@sendu.me.uk> References: <467661F0.2060703@sendu.me.uk> Message-ID: <46766279.7050202@sendu.me.uk> Sendu Bala wrote: > Adding support to test scripts consists of the following changes: > > + use Module::Build; > + my $build = Module::Build->current(get_options => { network => {} }); That should read: + my $build = Module::Build->current(); > + my $do_network_tests = $build->notes('network'); From cjfields at uiuc.edu Mon Jun 18 07:45:10 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 18 Jun 2007 06:45:10 -0500 Subject: [Bioperl-l] Network tests overhaul In-Reply-To: <46766279.7050202@sendu.me.uk> References: <467661F0.2060703@sendu.me.uk> <46766279.7050202@sendu.me.uk> Message-ID: The idea sounds good, though if we plan on doing this we need to update the Test HOWTO as well. Some modules require only a few (<50% of the total) network tests; I think SeqFeature.t may be one, though I'm not sure. Does this handle those cases? 
chris

On Jun 18, 2007, at 5:46 AM, Sendu Bala wrote:

> Sendu Bala wrote:
>> Adding support to test scripts consists of the following changes:
>>
>> + use Module::Build;
>> + my $build = Module::Build->current(get_options => { network => {} });
>
> That should read:
> + my $build = Module::Build->current();
>
>> + my $do_network_tests = $build->notes('network');

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From bix at sendu.me.uk Mon Jun 18 07:49:18 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 18 Jun 2007 12:49:18 +0100
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To:
References: <467661F0.2060703@sendu.me.uk> <46766279.7050202@sendu.me.uk>
Message-ID: <4676713E.1000508@sendu.me.uk>

Chris Fields wrote:
> The idea sounds good, though if we plan on doing this we need to update
> the Test HOWTO as well.
>
> Some modules require only a few (<50% of the total) network tests; I
> think SeqFeature.t may be one, though I'm not sure. Does this handle
> those cases?

Yes, the system just gives the test script a boolean describing if network
tests should be run. The script can then do whatever it wants with the
boolean. Skip all tests, skip no tests, skip just some tests... it's a drop-in
replacement for the current 'debug' boolean used based on BIOPERLDEBUG.
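
The "skip just some tests" case described in the reply above could look roughly like the following. This is only a minimal sketch, assuming a Test::More-style test script; the flag is hard-coded here so the example runs on its own, whereas a real BioPerl test would presumably read it from Module::Build as shown in the thread. The test names are placeholders, not tests from the BioPerl suite.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Assumption: in a real test script this value would come from
#   Module::Build->current->notes('network');
# It is hard-coded (hypothetical) so that the sketch is self-contained.
my $do_network_tests = 0;

# A purely local test: runs whether or not network tests are enabled.
ok(1, 'local test runs regardless of the network setting');

SKIP: {
    # Only the tests inside this block need the network, so only they
    # are skipped when network tests have not been enabled.
    skip('network tests have not been enabled', 2) unless $do_network_tests;

    ok(1, 'first network-dependent test (placeholder)');
    ok(1, 'second network-dependent test (placeholder)');
}
```

The same boolean could equally be used to skip the whole script or nothing at all; the SKIP block simply keeps the test count stable in either case.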
From hlapp at gmx.net Mon Jun 18 08:38:25 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 08:38:25 -0400
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <487845.37410.qm@web56510.mail.re3.yahoo.com>
References: <487845.37410.qm@web56510.mail.re3.yahoo.com>
Message-ID: <12C87737-B6D5-46E5-BCF4-5388A949009B@gmx.net>

I'm a bit confused - it sounds like you have set up a local BioSQL database
and loaded the NCBI taxonomy into the database. You can now use simple SQL to
retrieve all descendants of a node in the tree given its NCBI taxonID such as

  SELECT tn.*, tnm.name
  FROM taxon tn, taxon_name tnm, node n
  WHERE n.ncbi_taxon_id = :taxonID
  AND tn.left_value > n.left_value
  AND tn.right_value < n.right_value
  AND tn.taxon_id = tnm.taxon_id
  AND tn.name_class = 'scientific_name'

BioPerl doesn't have a Taxonomy::biosql module yet (though this would seem
like a worthwhile thing to add), so you can't use the Bio::DB::Taxonomy
interface to do this against a BioSQL instance.

However, BioPerl does have support for the flat-file download of the NCBI
taxonomy database and indexes it, so you can simply use
Taxonomy::{get_taxon,get_all_Descendants} using the flatfile download to
achieve what you wanted to do in less than 5 lines of perl.

Although the recursive implementation of Taxonomy::get_all_Descendants()
won't be lightning fast, it may still be perfectly fine for your application -
are you sure it is not?

-hilmar

On Jun 18, 2007, at 12:21 AM, George Heller wrote:

> Thanks. And how can I assign the $node here in the below code, such
> that I can reference it to a particular taxon id record? I want to
> retrieve all the descendents from the taxonomy hierarchy, given a
> particular taxon id.
>
> I have a local db setup, in which I have uploaded data using the
> load_ncbi_taxonomy.pl script.
>
> Thanks.
> George
>
> Jason Stajich wrote:
> I assume you already figured out how to setup a local taxonomydb?
> > > You just want the extant species/leaves of the tree > > > my @extant_children = grep { $_->is_Leaf } $node->get_all_Descedents; > > > > -jason > On Jun 17, 2007, at 11:41 AM, George Heller wrote: > > Hi all, > > > Can anyone point me to some example that uses the > get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at > this, and I am not quite sure how to implement it. > > > Thanks. > George > > > Sendu Bala wrote: > George Heller wrote: > Hi all, > > > I am looking at extracting the taxonomy hierarchy for some taxon > ids. > What I plan to do is, for a given taxon id, say 33090, I want to > extract all taxon ids that are children of this species. I do not > just want the immediate children, but the children's children and so > on. > > > Any ideas on the way I can go about doing this? > > > Well, you'll use Bio::DB::Taxonomy presumably, and > each_Descendent in > some kind of looping structure. Most easily a recursing sub. > > > If you happen to code up something neat and efficient, why not > share it > with us and we could add it to the Taxonomy module(s). > > > > > > > --------------------------------- > Shape Yahoo! in your own image. Join our Network Research Panel > today! > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > -- > Jason Stajich > jason at bioperl.org > http://jason.open-bio.org/ > > > > > > > > --------------------------------- > Need a vacation? Get great deals to amazing places on Yahoo! Travel. 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Mon Jun 18 08:44:22 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 18 Jun 2007 08:44:22 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: Message-ID: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> Just curious - how do you cvs commit then to an external repository? Is that open in the firewall? It is true though that corporations typically will not permit any encrypted outgoing traffic through their firewall except https. sf.net only supports https for svn, AFAIK. -hilmar On Jun 17, 2007, at 10:34 PM, aaron.j.mackey at gsk.com wrote: >> On Jun 16, 2007, at 7:21 AM, Sean Davis wrote: >> >>> As for access, the typical access is over http (or https). >> >> We're using svn+ssh here (NESCent) > > Let me just note that https is preferable to ssh for those poor slobs > stuck behind a corporate firewall (svn happily prompts me for my proxy > server's user/pass, then my https authentication realm's user/pass > - all > then get cached in some .svn/ file that I don't have to worry about > again > until my proxy server password changes once a month ...) 
> > -Aaron > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Mon Jun 18 08:47:56 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 18 Jun 2007 08:47:56 -0400 Subject: [Bioperl-l] Network tests overhaul In-Reply-To: <467661F0.2060703@sendu.me.uk> References: <467661F0.2060703@sendu.me.uk> Message-ID: Sounds like a great idea to me. -hilmar On Jun 18, 2007, at 6:44 AM, Sendu Bala wrote: > When the test suite runs currently, most (the intent is all) tests > skip > if the test would require network (internet) access. This is to avoid > tests failing not due to bugs in Bioperl code, but due to temporarily > inaccessible servers. This is also to make running the test suite > faster. > > To do a complete test you currently have to set BIOPERLDEBUG to true, > which activates the network test but also increases verbosity. This > actually causes a problem, since when running the entire test suite > the > additional debug information is more a hindrance than a help, since > the > reams of printed information can hide significant warnings that may > also > get printed. Its also ugly. > > The solution is to divorce activation of network tests from the > request > for verbosity. The obvious implementation is to have another > environment > variable, perhaps BIOPERLNETWORK. However, there is an opportunity > to do > something more appropriate. The running of networking tests should > be a > choice given to every end-user installing Bioperl. Debugging > information, on the other hand, is only of interest to the developer > working on a specific module under test, so can be left as a 'hidden' > env var. 
> > > I have just committed one possible implementation along these lines. > > You say: > perl Build.PL > as normal, and if you seem to have internet access it asks you if > you'd > like to run network tests. The default answer is no. If you answer > yes, > network tests will be enabled. > > You can alternatively say: > perl Build.PL --network > and if you seem to have internet access, network tests will be > enabled. > > Then you run the tests: > ./Build test > Any tests written to support the new system will then skip network > tests > if they haven't been enabled. > > The only test I've written to support the new system is t/ > RemoteBlast.t: > ./Build test --test_files t/RemoteBlast.t --verbose > > > Adding support to test scripts consists of the following changes: > > + use Module::Build; > + my $build = Module::Build->current(get_options => { network => > {} }); > + my $do_network_tests = $build->notes('network'); > > ! if (!$ENV{'BIOPERLDEBUG'}) { # skip network tests > --- > ! if (!$do_network_tests) { # skip network tests > > > I propose adding this support to all test scripts that carry out > network > tests. Does anyone have objections? Does anyone have alternate > implementations that may be superior? > > I specifically suggest we don't use an env var in addition to the > above, > because the multiple ways of doing things could lead to confusion. > Which > takes priority? Did a user really have the networking tests turned on > when he reported his test results? > > > The one thing I need help with is identifying which tests attempt to > access the internet. I think we caught most of them for the 1.5.2 > release, but I think there are more lurking around. Can anyone offer a > way to systematically find at least the test scripts which access the > internet, if not the specific tests within? > > Cheers, > Sendu. 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Mon Jun 18 08:55:53 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 18 Jun 2007 07:55:53 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> References: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> Message-ID: <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu> On Jun 18, 2007, at 7:44 AM, Hilmar Lapp wrote: > Just curious - how do you cvs commit then to an external repository? > Is that open in the firewall? > > It is true though that corporations typically will not permit any > encrypted outgoing traffic through their firewall except https. > sf.net only supports https for svn, AFAIK. > > -hilmar If so it may be better to allow https, though I don't know how Chris D. and others feel about it. Did we make a decision as to the fate of cvs if we get svn up-and- running? Keep it around (assuming svn commits would be carried over to cvs and vice versa)? Or see what happens over time? chris From sdavis2 at mail.nih.gov Mon Jun 18 09:05:50 2007 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Mon, 18 Jun 2007 09:05:50 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: Message-ID: <4676832E.5080704@mail.nih.gov> aaron.j.mackey at gsk.com wrote: >> On Jun 16, 2007, at 7:21 AM, Sean Davis wrote: >> >>> As for access, the typical access is over http (or https). 
>> We're using svn+ssh here (NESCent)
>
> Let me just note that https is preferable to ssh for those poor slobs
> stuck behind a corporate firewall (svn happily prompts me for my proxy
> server's user/pass, then my https authentication realm's user/pass - all
> then get cached in some .svn/ file that I don't have to worry about again
> until my proxy server password changes once a month ...)

That would be my suggestion as well (although I added it only
parenthetically).

Sean

From hlapp at gmx.net Mon Jun 18 09:13:27 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 09:13:27 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
References: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
Message-ID: <6759E154-CB02-4D76-8AB8-26C19FB952D2@gmx.net>

On Jun 18, 2007, at 8:55 AM, Chris Fields wrote:

> Did we make a decision as to the fate of cvs if we get svn up-and-running?
> Keep it around (assuming svn commits would be carried over to cvs and
> vice versa)? Or see what happens over time?

Let's not plan for having cvs and svn writable repositories in parallel -
that would create an administrative nightmare. Once the tests complete,
there'll be a clean cut-over.

What Jason suggested is to try and continue a read-only (anonymous) cvs
repository, updated from the svn repository that the developers use, aside
from an anonymous svn repository mirroring the writable one. This would
primarily be for maintaining working URLs for those folks who http-linked into
the anonymous cvs repository. What I added earlier is that even if that fails
to be feasible, you can achieve the goal using some small CGI script and
apache redirect to map CVS-style links to the anonymous svn repository.
-hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Mon Jun 18 09:31:35 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 18 Jun 2007 08:31:35 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <6759E154-CB02-4D76-8AB8-26C19FB952D2@gmx.net> References: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu> <6759E154-CB02-4D76-8AB8-26C19FB952D2@gmx.net> Message-ID: <0E64DBD0-BBE9-411A-A146-70236EF558BB@uiuc.edu> On Jun 18, 2007, at 8:13 AM, Hilmar Lapp wrote: > > On Jun 18, 2007, at 8:55 AM, Chris Fields wrote: > >> Did we make a decision as to the fate of cvs if we get svn up-and- >> running? Keep it around (assuming svn commits would be carried >> over to cvs and vice versa)? Or see what happens over time? > > Let's not plan for having cvs and svn writable repositories in > parallel - that would create an administrative nightmare. Once the > tests complete, there'll be a clean cut-over. My thoughts as well. Much simpler. > What Jason suggested is to try and continue a read-only (anonymous) > cvs repository, updated from the svn repository that the developers > use, aside from an anonymous svn repository mirroring the writable > one. This would primarily be for maintaining working URLs for those > folks who http-linked into the anonymous cvs repository. What I > added earlier is that even if that fails to be feasible, you can > achieve the goal using some small CGI script and apache redirect to > map CVS-style links to the anonymous svn repository. > > -hilmar I like the idea of a read-only cvs or a 'faux' cvs, though the former would initially be easier as we already have it available. We could just lock it down at some switchover point to read-only (something I think Jason also suggested). 
chris From bix at sendu.me.uk Mon Jun 18 09:13:33 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 18 Jun 2007 14:13:33 +0100 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> <3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu> Message-ID: <467684FD.3080300@sendu.me.uk> Chris Fields wrote: > > On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote: >> If its going to be difficult and a hassle, for such an unnecessary >> thing I'm not sure its worth it. There are more pressing things to be >> done for Bioperl. >> >> If I can just run perltidy on the entire package and commit, I'd do >> it. If that's not appropriate, I won't. > > The choices aren't necessarily all or nothing. What about voluntary, > recommended use of a perltidy config file included with the > distribution, with additional 'caveats'? I'm happy with that idea. Why not come up with something and make it available for us to try out? Cheers, Sendu. From bix at sendu.me.uk Mon Jun 18 09:26:36 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 18 Jun 2007 14:26:36 +0100 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu> References: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu> Message-ID: <4676880C.9030009@sendu.me.uk> Chris Fields wrote: > If so it may be better to allow https, though I don't know how Chris > D. and others feel about it. If it makes no difference to me as an end-user, I won't mind. But I won't want to enter my password even once, at the beginning of a session. If that's not possible with https, then ssh should be an option as well. Unrelated, but it randomly just occurred to me: what happens to all the id lines at the top of modules? 
Eg: $Id: bl2seq.pm,v 1.28 2007/06/14 14:16:10 sendu Exp $ That's a cvs-specific thing, right? Do we delete them all? (Regardless, I wish we would, since they caused me no end of hassles during the 1.5.2 release, doing updates across branches.) > Did we make a decision as to the fate of cvs if we get svn up-and- > running? Keep it around (assuming svn commits would be carried over > to cvs and vice versa)? Or see what happens over time? Well, I don't think hard decisions are possible until we know how its going to work in practice. I tried setting up my own svn repository once, but didn't keep it and can't remember much about it. So, I suppose we'll play it by ear and decide things later. Is someone out there actively doing something leading toward a demonstration of how it will be? From cjfields at uiuc.edu Mon Jun 18 09:58:34 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 18 Jun 2007 08:58:34 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <467684FD.3080300@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> <3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu> <467684FD.3080300@sendu.me.uk> Message-ID: On Jun 18, 2007, at 8:13 AM, Sendu Bala wrote: > Chris Fields wrote: >> >> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote: >>> If its going to be difficult and a hassle, for such an unnecessary >>> thing I'm not sure its worth it. There are more pressing things >>> to be >>> done for Bioperl. >>> >>> If I can just run perltidy on the entire package and commit, I'd do >>> it. If that's not appropriate, I won't. >> >> The choices aren't necessarily all or nothing. What about voluntary, >> recommended use of a perltidy config file included with the >> distribution, with additional 'caveats'? > > I'm happy with that idea. Why not come up with something and make it > available for us to try out? > > > Cheers, > Sendu. 
Will do. Maybe something that conforms to PBP; there's a PBP perltidy config on perlmonks, along with some emacs/vim related bits: http://www.perlmonks.org/?node_id=516501 chris From sdavis2 at mail.nih.gov Mon Jun 18 10:03:35 2007 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Mon, 18 Jun 2007 10:03:35 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <4676880C.9030009@sendu.me.uk> References: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu> <4676880C.9030009@sendu.me.uk> Message-ID: <467690B7.7090105@mail.nih.gov> Sendu Bala wrote: > Chris Fields wrote: >> If so it may be better to allow https, though I don't know how Chris >> D. and others feel about it. > > If it makes no difference to me as an end-user, I won't mind. But I > won't want to enter my password even once, at the beginning of a > session. If that's not possible with https, then ssh should be an option > as well. > > > Unrelated, but it randomly just occurred to me: what happens to all the > id lines at the top of modules? Eg: > > $Id: bl2seq.pm,v 1.28 2007/06/14 14:16:10 sendu Exp $ > > That's a cvs-specific thing, right? Do we delete them all? (Regardless, > I wish we would, since they caused me no end of hassles during the 1.5.2 > release, doing updates across branches.) See here: http://svnbook.red-bean.com/en/1.0/ch07s02.html Check out the section at the bottom having to do with svn:keywords. 
Sean

From akarger at CGR.Harvard.edu Mon Jun 18 10:10:57 2007
From: akarger at CGR.Harvard.edu (Amir Karger)
Date: Mon, 18 Jun 2007 10:10:57 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <46751EC7.8020609@sheffield.ac.uk>
References: <10517356.1182013758160.JavaMail.myubc2@brahms.my.ubc.ca> <46751EC7.8020609@sheffield.ac.uk>
Message-ID:

> Just to clarify, subversion is available as command line for windows:
> http://subversion.tigris.org/project_packages.html
>
> TortoiseSVN is another svn client with a GUI that integrates into the
> shell. I tried setting this up a while back to use ssh (via PUTTY), but
> I wasn't successful. This may have been due to me just starting out with
> svn or that it was harder to setup in an earlier version of TortoiseSVN.
>
> Does anyone have experience of setting up svn on Windows to use ssh? If
> the changeover takes place, I'm happy to write some howto's for setting
> up svn clients for Windows.

Here are some notes I wrote recently. I'm using this with command-line svn,
not TortoiseSVN. I would hope that it would work with Tortoise, too, but I
can't guarantee.

1. Run PuTTYgen (installed with PuTTY, probably in Start
   menu->Programs->PuTTY) and follow directions to create a private key file
   like C:\someplace\private_key.ppk and a public key. At this point, you'll
   pick an ssh password, which is separate from your login password.

2. Get an account with the appropriate .ssh/authorized_keys file on the host
   machine. (This is not Windows-specific. By the way, if you change the lines
   of the authorized_keys file to start with, e.g.,

     command="svnserve -t -r /main/repos/dir",no-pty ssh-rsa AAAAB... comment

   then (a) you're more secure because users can't open a real shell on the
   computer, and (b) users don't need to type the repository directory in
   their svn co commands.)

3. Set your environment variables (My Computer->Properties. Advanced Tab,
   click on Environment Variables. In the top half ("User variables for ..."),
   click "New" and put in the variable name and value).

3a. Set the SVN_EDITOR environment variable to your favorite editor, such as
    vim or emacs, or a full path to some other editor. If it's not set, then
    either VISUAL or EDITOR must be set.

3b. Set the SVN_SSH environment variable to run PuTTY's "plink" program, which
    is the Windows equivalent of command-line ssh. If you installed PuTTY in
    the default location, set it to "C:/Program Files/PuTTY/plink.exe".
    Note 1: use FORWARD slashes.
    Note 2: Include the quotation marks in the environment variable.

4. When you want to start using svn, you'll need to run Pageant (Start
   menu->Programs->PuTTY), select "Add Key", browse to your private key file,
   and enter the ssh password you chose in step 1 (not your login password).
   Pageant will stay running until you quit it or logout, so you can have
   multiple svn checkins etc., and you only need to type in your password
   once.

5. Now just run command-line svn commands the same way you would on UNIX
   (modulo Windows' brain-dead shell).

-Amir Karger

From cjfields at uiuc.edu Mon Jun 18 10:24:00 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 09:24:00 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <4676880C.9030009@sendu.me.uk>
References: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu> <4676880C.9030009@sendu.me.uk>
Message-ID: <278C510F-8873-4F3D-A399-955979DD3DA5@uiuc.edu>

On Jun 18, 2007, at 8:26 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> If so it may be better to allow https, though I don't know how
>> Chris D. and others feel about it.
>
> If it makes no difference to me as an end-user, I won't mind. But I
> won't want to enter my password even once, at the beginning of a
> session. If that's not possible with https, then ssh should be an
> option as well.
Aaron pointed out in a related post that https access is the preferred option behind a corporate firewall (svn prompts for proxy user/pass, then caches it). Not sure how Jason/Hilmar/Chris D. feel about https or supporting both https+ssh. ... >> Did we make a decision as to the fate of cvs if we get svn up-and- >> running? Keep it around (assuming svn commits would be carried >> over to cvs and vice versa)? Or see what happens over time? > > Well, I don't think hard decisions are possible until we know how > its going to work in practice. I tried setting up my own svn > repository once, but didn't keep it and can't remember much about it. Agree; we'll need to work out specifics once we know how things work out using cvs2svn. I think the idea is to test using a smaller distribution (maybe network or db) and move up from there. > So, I suppose we'll play it by ear and decide things later. Is > someone out there actively doing something leading toward a > demonstration of how it will be? George Hartzell is going to test it out, I believe, and will post something when he can. chris From dmessina at wustl.edu Mon Jun 18 10:54:31 2007 From: dmessina at wustl.edu (David Messina) Date: Mon, 18 Jun 2007 09:54:31 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> <3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu> <467684FD.3080300@sendu.me.uk> Message-ID: <67E635BD-FC19-4046-949B-358B671299E6@wustl.edu> [Chris F] > Will do. 
> Maybe something that conforms to PBP; there's a PBP
> perltidy config on perlmonks, along with some emacs/vim related bits:
>
> http://www.perlmonks.org/?node_id=516501

FYI, perltidy now has a built-in -pbp flag:

[from perltidy-20070508]
> -pbp, --perl-best-practices
>   -pbp is an abbreviation for the parameters in the book Perl Best
>   Practices by Damian Conway:
>
>     -l=78 -i=4 -ci=4 -st -se -vt=2 -cti=0 -pt=1 -bt=1 -sbt=1 -bbt=1
>     -nsfs -nolq
>     -wbb="% + - * / x != == >= <= =~ !~ < > | & =
>     **= += *= &= <<= &&= -= /= |= >>= ||= //= .= %= ^= x="
>
>   Note that the -st and -se flags make perltidy act as a filter on
>   one file only. These can be overridden with -nst and -nse if
>   necessary.

[full docs here: http://search.cpan.org/~shancock/Perl-Tidy-20070508/bin/perltidy]

Dave

From dmessina at wustl.edu Mon Jun 18 11:04:10 2007
From: dmessina at wustl.edu (David Messina)
Date: Mon, 18 Jun 2007 10:04:10 -0500
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <467661F0.2060703@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk>
Message-ID:

Awesome, Sendu! Really glad you implemented this.

> Can anyone offer a
> way to systematically find at least the test scripts which access the
> internet, if not the specific tests within?

I think tests would be accessing the net indirectly through a BioPerl module
(which may also be using indirect access), so it'd be hard to come up with a
universal glob for that.

However:

  % grep -Rl BIOPERLDEBUG bioperl-live/t | wc -l
  108

  % ls -1 bioperl-live/t | wc -l
  248

Less than half of the test files use BIOPERLDEBUG, so that narrows down the
possibilities...
Dave From bix at sendu.me.uk Mon Jun 18 11:09:19 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 18 Jun 2007 16:09:19 +0100 Subject: [Bioperl-l] Network tests overhaul In-Reply-To: References: <467661F0.2060703@sendu.me.uk> Message-ID: <4676A01F.30205@sendu.me.uk> David Messina wrote: >> Can anyone offer a >> way to systematically find at least the test scripts which access the >> internet, if not the specific tests within? > > I think tests would be accessing the net indirectly through a BioPerl > module (which may also be using indirect access), so it'd be hard to > come up with a universal glob for that. > > However: > > % grep -Rl BIOPERLDEBUG bioperl-live/t | wc -l > 108 > > % ls -1 bioperl-live/t | wc -l > 248 > > Less than half of the test files use BIOPERLDEBUG, so that narrows down > the possibilities... Not necessarily. The problem is that there may be test scripts that have never even tried to skip network tests, and therefore don't use BIOPERLDEBUG. (Or that chose their own way to decide when to skip.) I was thinking along the lines of, does anyone know how to monitor accesses to the network card (or equivalent), getting information on which program (test script) requested the access? From cjfields at uiuc.edu Mon Jun 18 11:41:28 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 18 Jun 2007 10:41:28 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <67E635BD-FC19-4046-949B-358B671299E6@wustl.edu> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> <3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu> <467684FD.3080300@sendu.me.uk> <67E635BD-FC19-4046-949B-358B671299E6@wustl.edu> Message-ID: On Jun 18, 2007, at 9:54 AM, David Messina wrote: > [Chris F] >> Will do. 
Maybe something that conforms to PBP; there's a PBP >> perltidy config on perlmonks, along with some emacs/vim related bits: >> >> http://www.perlmonks.org/?node_id=516501 > > > FYI, perltidy now has a built-in -pbp flag: > > [from perltidy-20070508] >> -pbp, --perl-best-practices >> -pbp is an abbreviation for the parameters in the book Perl Best >> Practices by Damian Conway: >> >> -l=78 -i=4 -ci=4 -st -se -vt=2 -cti=0 -pt=1 -bt=1 -sbt=1 -bbt=1 >> -nsfs -nolq >> -wbb="% + - * / x != == >= <= =~ !~ < > | & = >> **= += *= &= <<= &&= -= /= |= >>= ||= //= .= %= ^= x=" >> Note that the -st and -se flags make perltidy act as a filter on >> one file only. These can be overridden with -nst and -nse if >> necessary. >> > [full docs here: http://search.cpan.org/~shancock/Perl-Tidy-20070508/ > bin/perltidy] > > > Dave Makes sense that would eventually be incorporated. If so there's no need to include a config (unless we want to sway away from PBP-style). We can just recommend everyone use that setting. chris From cjfields at uiuc.edu Mon Jun 18 12:06:26 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 18 Jun 2007 11:06:26 -0500 Subject: [Bioperl-l] Network tests overhaul In-Reply-To: <4676A01F.30205@sendu.me.uk> References: <467661F0.2060703@sendu.me.uk> <4676A01F.30205@sendu.me.uk> Message-ID: <082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu> On Jun 18, 2007, at 10:09 AM, Sendu Bala wrote: > David Messina wrote: >>> ... >> Less than half of the test files use BIOPERLDEBUG, so that narrows >> down >> the possibilities... > > Not necessarily. The problem is that there may be test scripts that > have > never even tried to skip network tests, and therefore don't use > BIOPERLDEBUG. (Or that chose their own way to decide when to skip.) > > I was thinking along the lines of, does anyone know how to monitor > accesses to the network card (or equivalent), getting information on > which program (test script) requested the access? 
EUtilities.t uses network tests predominantly. I'll switch over when I
commit everything from the overhaul.

Couldn't you enable BIOPERLDEBUG, disable network access, then iterate
through tests checking for those which fail or skip? I think
Test::Harness has a way to do this, using execute_tests().

chris

From bix at sendu.me.uk Mon Jun 18 12:34:38 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 18 Jun 2007 17:34:38 +0100
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu>
References: <467661F0.2060703@sendu.me.uk> <4676A01F.30205@sendu.me.uk> <082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu>
Message-ID: <4676B41E.3050706@sendu.me.uk>

Chris Fields wrote:
> Couldn't you enable BIOPERLDEBUG, disable network access, then iterate
> through tests checking for those which fail or skip?

Yes, good idea, though my dev machine is also my email/webserver so I'd
rather come up with an alternate solution than one involving 'disable
network access'.

Still, that's what I'll probably end up doing. Cheers!

Oh, Chris, Spiros, how goes the Test::More conversion? I might want to
wait for you to finish, or join in? If you're not going to have time to
do any more in the next few weeks, can you please update
http://www.bioperl.org/wiki/TestMoreProgress removing your name (or in
the opposite case, add your name in)? It's not quite clear to me which
tests are assigned to whom. Can someone clarify what the markings mean?

Cheers,
Sendu.
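
For reference, the kind of skipping discussed in this thread could look
roughly like the sketch below. It is only an illustration, not code from
the BioPerl test suite: it assumes that a true BIOPERLDEBUG means
"network tests may run", and network_check() is a hypothetical stand-in
that does no real network access.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Assumption: a true BIOPERLDEBUG means network tests are allowed.
my $run_network = $ENV{BIOPERLDEBUG} ? 1 : 0;

# A test that needs no network always runs.
ok(1, 'offline test always runs');

SKIP: {
    # Skip exactly one test when network access is not wanted.
    skip 'network tests disabled; set BIOPERLDEBUG=1 to run them', 1
        unless $run_network;

    ok(network_check(), 'remote service reachable');
}

# Hypothetical placeholder for a real network-dependent check.
sub network_check { return 1 }
```

A script written this way reports its network tests as skipped rather
than failed, which would make the "disable network access and see what
fails" approach easier to interpret.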
From cjfields at uiuc.edu Mon Jun 18 12:43:31 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 11:43:31 -0500
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <4676B41E.3050706@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk> <4676A01F.30205@sendu.me.uk> <082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu> <4676B41E.3050706@sendu.me.uk>
Message-ID: <4D4061A2-E937-4F97-85C3-8937B597C64F@uiuc.edu>

On Jun 18, 2007, at 11:34 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> Couldn't you enable BIOPERLDEBUG, disable network access, then
>> iterate through tests checking for those which fail or skip?
>
> Yes, good idea, though my dev machine is also my email/webserver so
> I'd rather come up with an alternate solution than one involving
> 'disable network access'.
>
> Still, that's what I'll probably end up doing. Cheers!
>
>
> Oh, Chris, Spiros, how goes the Test::More conversion? I might want
> to wait for you to finish, or join in? If you're not going to have
> time to do any more in the next few weeks, can you please update
> http://www.bioperl.org/wiki/TestMoreProgress removing your name (or
> in the opposite case, add your name in)? It's not quite clear to me
> which tests are assigned to whom. Can someone clarify what the
> markings mean?
>
> Cheers,
> Sendu.

Not sure how far along Spiros is; I handed it over after I finished up
to the 'Q' tests. In general the ones marked out have been converted
over, and the ones with names next to them have been claimed. If you
need help I'll probably start back up again to finish them off; we just
need to divvy them up.

chris

From george.heller at yahoo.com Mon Jun 18 13:07:59 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 10:07:59 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <12C87737-B6D5-46E5-BCF4-5388A949009B@gmx.net>
Message-ID: <218165.62089.qm@web56505.mail.re3.yahoo.com>

What exactly is the "node n" in the query below.
When I issue this query, it says, relation "node" does not exist. I tried to use the get_all_Descendents method but it looks like in order to do a recursive call it calls the method each_Descendent. This method is not implemented in Bio::DB::Taxonomy. It just has a single line, shift->throw_not_implemented(); Thanks. George. Hilmar Lapp wrote: I'm a bit confused - it sounds like you have set up a local BioSQL database and loaded the NCBI taxonomy into the database. You can now use simple SQL to retrieve all descendants of a node in the tree given its NCBI taxonID such as SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n WHERE n.ncbi_taxon_id = :taxonID AND tn.left_value > n. left_value AND tn.right_value < n.right_value AND tn.taxon_id = tnm.taxon_id AND tn.name_class = 'scientific_name' BioPerl doesn't have a Taxonomy::biosql module yet (though this would seem like a worthwhile thing to add), so you can't use the Bio::DB::Taxonomy interface to do this against a BioSQL instance. However, BioPerl does have support for the flat-file download of the NCBI taxonomy database and indexes it, so you can simply use Taxonomy::{get_taxon,get_all_Descendants} using the flatfile download to achieve what you wanted to do in a less than 5 lines of perl. Although the recursive implementation of Taxonomy::get_all_Descendants () won't be lightning fast, it may still be perfectly fine for your application - are you sure it is not? -hilmar On Jun 18, 2007, at 12:21 AM, George Heller wrote: > Thanks. And how can I assign the $node here in the below code, such > that I can reference it to a particular taxon id record? I want to > retrieve all the descendents from the taxonomy hierarchy, given a > particular taxon id. > > I have a local db setup, in which I have uploaded data using the > load_ncbi_taxonomy.pl script. > > Thanks. > George > > Jason Stajich wrote: > I assume you already figured out how to setup a local taxonomydb? 
> > > You just want the extant species/leaves of the tree > > > my @extant_children = grep { $_->is_Leaf } $node->get_all_Descedents; > > > > -jason > On Jun 17, 2007, at 11:41 AM, George Heller wrote: > > Hi all, > > > Can anyone point me to some example that uses the > get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at > this, and I am not quite sure how to implement it. > > > Thanks. > George > > > Sendu Bala wrote: > George Heller wrote: > Hi all, > > > I am looking at extracting the taxonomy hierarchy for some taxon > ids. > What I plan to do is, for a given taxon id, say 33090, I want to > extract all taxon ids that are children of this species. I do not > just want the immediate children, but the children's children and so > on. > > > Any ideas on the way I can go about doing this? > > > Well, you'll use Bio::DB::Taxonomy presumably, and > each_Descendent in > some kind of looping structure. Most easily a recursing sub. > > > If you happen to code up something neat and efficient, why not > share it > with us and we could add it to the Taxonomy module(s). > > > > > > > --------------------------------- > Shape Yahoo! in your own image. Join our Network Research Panel > today! > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > -- > Jason Stajich > jason at bioperl.org > http://jason.open-bio.org/ > > > > > > > > --------------------------------- > Need a vacation? Get great deals to amazing places on Yahoo! Travel. 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From jason at bioperl.org Mon Jun 18 13:53:28 2007
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 18 Jun 2007 10:53:28 -0700
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <218165.62089.qm@web56505.mail.re3.yahoo.com>
References: <218165.62089.qm@web56505.mail.re3.yahoo.com>
Message-ID: <6F7C6223-2DD7-46B4-A637-EC6AFFC6BDBC@bioperl.org>

It is implemented in the implementing class - DB::Taxonomy is just the
base class. For example see the flatfile implementation
Bio::DB::Taxonomy::flatfile

See the scripts/taxa/local_taxonomydb_query.PLS for example using it:
nodes and names are from NCBI taxonomy database.

Here is an un-debugged copy+paste for your question that *should* work.

    use Bio::DB::Taxonomy;

    # Directory where the flatfile index is kept
    my $idx_dir = '/tmp';

    # nodes.dmp and names.dmp come from the NCBI taxonomy dump
    my ($nodesfile, $namesfile) = ('nodes.dmp', 'names.dmp');
    my $db = Bio::DB::Taxonomy->new(-source    => 'flatfile',
                                    -nodesfile => $nodesfile,
                                    -namesfile => $namesfile,
                                    -directory => $idx_dir);

    # Look up the node for the taxon id, then keep only the leaves
    my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
    my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

-jason

On Jun 18, 2007, at 10:07 AM, George Heller wrote:

> What exactly is the "node n" in the query below. When I issue this
> query, it says,
>
> relation "node" does not exist.
>
> I tried to use the get_all_Descendents method but it looks like
> in order to do a recursive call it calls the method
> each_Descendent. This method is not implemented in
> Bio::DB::Taxonomy. It just has a single line,
>
> shift->throw_not_implemented();
>
> Thanks.
> George.
> > Hilmar Lapp wrote: > I'm a bit confused - it sounds like you have set up a local BioSQL > database and loaded the NCBI taxonomy into the database. You can now > use simple SQL to retrieve all descendants of a node in the tree > given its NCBI taxonID such as > > SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n > WHERE > n.ncbi_taxon_id = :taxonID > AND tn.left_value > n. left_value > AND tn.right_value < n.right_value > AND tn.taxon_id = tnm.taxon_id > AND tn.name_class = 'scientific_name' > > BioPerl doesn't have a Taxonomy::biosql module yet (though this would > seem like a worthwhile thing to add), so you can't use the > Bio::DB::Taxonomy interface to do this against a BioSQL instance. > > However, BioPerl does have support for the flat-file download of the > NCBI taxonomy database and indexes it, so you can simply use > Taxonomy::{get_taxon,get_all_Descendants} using the flatfile download > to achieve what you wanted to do in a less than 5 lines of perl. > > Although the recursive implementation of Taxonomy::get_all_Descendants > () won't be lightning fast, it may still be perfectly fine for your > application - are you sure it is not? > > -hilmar > > On Jun 18, 2007, at 12:21 AM, George Heller wrote: > >> Thanks. And how can I assign the $node here in the below code, such >> that I can reference it to a particular taxon id record? I want to >> retrieve all the descendents from the taxonomy hierarchy, given a >> particular taxon id. >> >> I have a local db setup, in which I have uploaded data using the >> load_ncbi_taxonomy.pl script. >> >> Thanks. >> George >> >> Jason Stajich wrote: >> I assume you already figured out how to setup a local taxonomydb? 
>> >> >> You just want the extant species/leaves of the tree >> >> >> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descedents; >> >> >> >> -jason >> On Jun 17, 2007, at 11:41 AM, George Heller wrote: >> >> Hi all, >> >> >> Can anyone point me to some example that uses the >> get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at >> this, and I am not quite sure how to implement it. >> >> >> Thanks. >> George >> >> >> Sendu Bala wrote: >> George Heller wrote: >> Hi all, >> >> >> I am looking at extracting the taxonomy hierarchy for some taxon >> ids. >> What I plan to do is, for a given taxon id, say 33090, I want to >> extract all taxon ids that are children of this species. I do not >> just want the immediate children, but the children's children and so >> on. >> >> >> Any ideas on the way I can go about doing this? >> >> >> Well, you'll use Bio::DB::Taxonomy presumably, and >> each_Descendent in >> some kind of looping structure. Most easily a recursing sub. >> >> >> If you happen to code up something neat and efficient, why not >> share it >> with us and we could add it to the Taxonomy module(s). >> >> >> >> >> >> >> --------------------------------- >> Shape Yahoo! in your own image. Join our Network Research Panel >> today! >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> >> -- >> Jason Stajich >> jason at bioperl.org >> http://jason.open-bio.org/ >> >> >> >> >> >> >> >> --------------------------------- >> Need a vacation? Get great deals to amazing places on Yahoo! Travel. 
>> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > > > > --------------------------------- > Take the Internet to Go: Yahoo!Go puts the Internet in your pocket: > mail, news, photos & more. -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From hlapp at gmx.net Mon Jun 18 18:10:00 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 18 Jun 2007 18:10:00 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <278C510F-8873-4F3D-A399-955979DD3DA5@uiuc.edu> References: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu> <4676880C.9030009@sendu.me.uk> <278C510F-8873-4F3D-A399-955979DD3DA5@uiuc.edu> Message-ID: <989DBD68-896E-4FB9-9413-4A1060E88ABD@gmx.net> https is working fine for me for sf.net repositories, and I only have to enter the password upon first commit (since checkout doesn't even need a password). -hilmar On Jun 18, 2007, at 10:24 AM, Chris Fields wrote: > Not sure how Jason/Hilmar/Chris D. 
feel about https or supporting > both https+ssh -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From george.heller at yahoo.com Mon Jun 18 18:18:21 2007 From: george.heller at yahoo.com (George Heller) Date: Mon, 18 Jun 2007 15:18:21 -0700 (PDT) Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <6F7C6223-2DD7-46B4-A637-EC6AFFC6BDBC@bioperl.org> Message-ID: <904670.24974.qm@web56513.mail.re3.yahoo.com> I tried running the below mentioned script and I seem to be getting the following error: Weak references are not implemented in the version of perl at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76 BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76. Compilation failed in require at my.pl line 7. BEGIN failed--compilation aborted at my.pl line 7. My script looks something like, #!/usr/bin/perl use strict; #use warnings; use DBI; use Bio::Tree::Node; use Bio::DB::Taxonomy; use Bio::DB::Taxonomy::flatfile; my $idx_dir = '/tmp'; my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp'); my $db = new Bio::DB::Taxonomy(-source => 'flatfile', -nodesfile => $nodesfile, -namesfile => $namesfile, -directory => $idx_dir); my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents; foreach $field (@extant_children) { print "$field"; print "|"; print "\n"; } And I am running the script using the command, perl myscript.pl -v --names names.dmp --nodes nodes.dmp and I have the nodes.dmp and names.dmp files in the current directory. Thanks, George Jason Stajich wrote: It is implemented in the implementing class - DB::Taxonomy is just the base class. 
For example see the flatfile implementation Bio::DB::Taxonomy::flatfile See the scripts/taxa/local_taxonomydb_query.PLS for example using it: nodes and names are from NCBI taxonomy database. Here is an un-debugged copy+paste for your question that *should* work. use Bio::DB::Taxonomy my $idx_dir = '/tmp'; my ($nodefile,$namesfile) = ('nodes.dmp,'names.dmp'); my $db = new Bio::DB::Taxonomy(-source => 'flatfile', -nodesfile => $nodesfile, -namesfile => $namesfile, -directory => $idx_dir); my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents; -jason On Jun 18, 2007, at 10:07 AM, George Heller wrote: What exactly is the "node n" in the query below. When I issue this query, it says, relation "node" does not exist. I tried to use the get_all_Descendents method but it looks like in order to do a recursive call it calls the method each_Descendent. This method is not implemented in Bio::DB::Taxonomy. It just has a single line, shift->throw_not_implemented(); Thanks. George. Hilmar Lapp wrote: I'm a bit confused - it sounds like you have set up a local BioSQL database and loaded the NCBI taxonomy into the database. You can now use simple SQL to retrieve all descendants of a node in the tree given its NCBI taxonID such as SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n WHERE n.ncbi_taxon_id = :taxonID AND tn.left_value > n. left_value AND tn.right_value < n.right_value AND tn.taxon_id = tnm.taxon_id AND tn.name_class = 'scientific_name' BioPerl doesn't have a Taxonomy::biosql module yet (though this would seem like a worthwhile thing to add), so you can't use the Bio::DB::Taxonomy interface to do this against a BioSQL instance. However, BioPerl does have support for the flat-file download of the NCBI taxonomy database and indexes it, so you can simply use Taxonomy::{get_taxon,get_all_Descendants} using the flatfile download to achieve what you wanted to do in a less than 5 lines of perl. 
Although the recursive implementation of Taxonomy::get_all_Descendants () won't be lightning fast, it may still be perfectly fine for your application - are you sure it is not? -hilmar On Jun 18, 2007, at 12:21 AM, George Heller wrote: Thanks. And how can I assign the $node here in the below code, such that I can reference it to a particular taxon id record? I want to retrieve all the descendents from the taxonomy hierarchy, given a particular taxon id. I have a local db setup, in which I have uploaded data using the load_ncbi_taxonomy.pl script. Thanks. George Jason Stajich wrote: I assume you already figured out how to setup a local taxonomydb? You just want the extant species/leaves of the tree my @extant_children = grep { $_->is_Leaf } $node->get_all_Descedents; -jason On Jun 17, 2007, at 11:41 AM, George Heller wrote: Hi all, Can anyone point me to some example that uses the get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at this, and I am not quite sure how to implement it. Thanks. George Sendu Bala wrote: George Heller wrote: Hi all, I am looking at extracting the taxonomy hierarchy for some taxon ids. What I plan to do is, for a given taxon id, say 33090, I want to extract all taxon ids that are children of this species. I do not just want the immediate children, but the children's children and so on. Any ideas on the way I can go about doing this? Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in some kind of looping structure. Most easily a recursing sub. If you happen to code up something neat and efficient, why not share it with us and we could add it to the Taxonomy module(s). --------------------------------- Shape Yahoo! in your own image. Join our Network Research Panel today! 
_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/

_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/

From hlapp at gmx.net Mon Jun 18 18:27:19 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 18:27:19 -0400
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <218165.62089.qm@web56505.mail.re3.yahoo.com>
References: <218165.62089.qm@web56505.mail.re3.yahoo.com>
Message-ID:

On Jun 18, 2007, at 1:07 PM, George Heller wrote:

> What exactly is the "node n" in the query below. When I issue this
> query, it says,

Sorry, replace "node" with "taxon" in the FROM clause, so that the
alias n refers to the taxon table. Jason answered the rest.
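
To make the left_value/right_value condition in that query a bit more
concrete, here is a rough Perl sketch of the same containment test run
over a handful of in-memory rows. The taxon ids and interval numbers are
invented for illustration and descendants_of() is a hypothetical helper;
nothing here talks to a BioSQL database.

```perl
use strict;
use warnings;

# Hypothetical rows: taxon id => [left_value, right_value].
# The ids and numbers are made up for illustration only.
my %taxon = (
    1 => [1, 10],   # root
    2 => [2, 7],    # child of 1
    3 => [3, 4],    # child of 2
    4 => [5, 6],    # child of 2
    5 => [8, 9],    # child of 1
);

# Return the ids whose interval lies strictly inside that of $id,
# mirroring "tn.left_value > n.left_value AND
# tn.right_value < n.right_value" from the query.
sub descendants_of {
    my ($id) = @_;
    my ($left, $right) = @{ $taxon{$id} };
    return sort { $a <=> $b }
           grep { $taxon{$_}[0] > $left && $taxon{$_}[1] < $right }
           keys %taxon;
}

print join(',', descendants_of(2)), "\n";   # prints "3,4"
```

The point of the nested-set layout is that all descendants at any depth
fall out of one range comparison, with no recursion needed.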
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From cjfields at uiuc.edu Mon Jun 18 18:33:40 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 17:33:40 -0500
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <904670.24974.qm@web56513.mail.re3.yahoo.com>
References: <904670.24974.qm@web56513.mail.re3.yahoo.com>
Message-ID: <707D8821-68CB-40AE-A190-AD0F8F2C3915@uiuc.edu>

As the error implies, your local version of perl doesn't seem to support
weak references, which means it doesn't have Scalar::Util (which was
added to core after perl 5.6.1, I think). Try installing Scalar::Util
to see what happens.

chris

On Jun 18, 2007, at 5:18 PM, George Heller wrote:

> I tried running the below mentioned script and I seem to be getting
> the following error:
>
> Weak references are not implemented in the version of perl at /
> usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
> BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/
> Bio/Tree/Node.pm line 76.
> Compilation failed in require at my.pl line 7.
> BEGIN failed--compilation aborted at my.pl line 7.
> > My script looks something like, > > #!/usr/bin/perl > use strict; > #use warnings; > use DBI; > use Bio::Tree::Node; > use Bio::DB::Taxonomy; > use Bio::DB::Taxonomy::flatfile; > my $idx_dir = '/tmp'; > > my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp'); > my $db = new Bio::DB::Taxonomy(-source => 'flatfile', > -nodesfile => $nodesfile, > -namesfile => $namesfile, > -directory => $idx_dir); > my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); > my @extant_children = grep { $_->is_Leaf } $node- > >get_all_Descendents; > > foreach $field (@extant_children) { > print "$field"; > print "|"; > print "\n"; > } > > And I am running the script using the command, > > perl myscript.pl -v --names names.dmp --nodes nodes.dmp > > and I have the nodes.dmp and names.dmp files in the current > directory. > > Thanks, > George > > > Jason Stajich wrote: > It is implemented in the implementing class - DB::Taxonomy is > just the base class. For example see the flatfile implementation > Bio::DB::Taxonomy::flatfile > > See the scripts/taxa/local_taxonomydb_query.PLS for example using > it: > nodes and names are from NCBI taxonomy database. > > > Here is an un-debugged copy+paste for your question that *should* > work. > > > use Bio::DB::Taxonomy > my $idx_dir = '/tmp'; > > > my ($nodefile,$namesfile) = ('nodes.dmp,'names.dmp'); > my $db = new Bio::DB::Taxonomy(-source => 'flatfile', > -nodesfile => $nodesfile, > -namesfile => $namesfile, > -directory => $idx_dir); > my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); > my @extant_children = grep { $_->is_Leaf } $node- > >get_all_Descendents; > > > > > -jason > > On Jun 18, 2007, at 10:07 AM, George Heller wrote: > > What exactly is the "node n" in the query below. When I issue > this query, it says, > > > relation "node" does not exist. > > > I tried to use the get_all_Descendents method but it looks like > in order to do a recursive call it calls the method > each_Descendent. 
This method is not implemented in > Bio::DB::Taxonomy. It just has a single line, > > > shift->throw_not_implemented(); > > > Thanks. > George. > > > Hilmar Lapp wrote: > I'm a bit confused - it sounds like you have set up a local BioSQL > database and loaded the NCBI taxonomy into the database. You can now > use simple SQL to retrieve all descendants of a node in the tree > given its NCBI taxonID such as > > > SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n > WHERE > n.ncbi_taxon_id = :taxonID > AND tn.left_value > n. left_value > AND tn.right_value < n.right_value > AND tn.taxon_id = tnm.taxon_id > AND tn.name_class = 'scientific_name' > > > BioPerl doesn't have a Taxonomy::biosql module yet (though this > would > seem like a worthwhile thing to add), so you can't use the > Bio::DB::Taxonomy interface to do this against a BioSQL instance. > > > However, BioPerl does have support for the flat-file download of the > NCBI taxonomy database and indexes it, so you can simply use > Taxonomy::{get_taxon,get_all_Descendants} using the flatfile > download > to achieve what you wanted to do in a less than 5 lines of perl. > > > Although the recursive implementation of > Taxonomy::get_all_Descendants > () won't be lightning fast, it may still be perfectly fine for your > application - are you sure it is not? > > > -hilmar > > > On Jun 18, 2007, at 12:21 AM, George Heller wrote: > > > Thanks. And how can I assign the $node here in the below code, > such > that I can reference it to a particular taxon id record? I want to > retrieve all the descendents from the taxonomy hierarchy, given a > particular taxon id. > > > I have a local db setup, in which I have uploaded data using the > load_ncbi_taxonomy.pl script. > > > Thanks. > George > > > Jason Stajich wrote: > I assume you already figured out how to setup a local taxonomydb? 
> > > > > You just want the extant species/leaves of the tree > > > > > my @extant_children = grep { $_->is_Leaf } $node- > >get_all_Descedents; > > > > > > > -jason > On Jun 17, 2007, at 11:41 AM, George Heller wrote: > > > Hi all, > > > > > Can anyone point me to some example that uses the > get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at > this, and I am not quite sure how to implement it. > > > > > Thanks. > George > > > > > Sendu Bala wrote: > George Heller wrote: > Hi all, > > > > > I am looking at extracting the taxonomy hierarchy for some taxon > ids. > What I plan to do is, for a given taxon id, say 33090, I want to > extract all taxon ids that are children of this species. I do not > just want the immediate children, but the children's children and so > on. > > > > > Any ideas on the way I can go about doing this? > > > > > Well, you'll use Bio::DB::Taxonomy presumably, and > each_Descendent in > some kind of looping structure. Most easily a recursing sub. > > > > > If you happen to code up something neat and efficient, why not > share it > with us and we could add it to the Taxonomy module(s). > > > > > > > > > > > > > --------------------------------- > Shape Yahoo! in your own image. Join our Network Research Panel > today! > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > > -- > Jason Stajich > jason at bioperl.org > http://jason.open-bio.org/ > > > > > > > > > > > > > > > --------------------------------- > Need a vacation? Get great deals to amazing places on Yahoo! Travel. 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > > > > > > > > > > > > --------------------------------- > Take the Internet to Go: Yahoo!Go puts the Internet in your > pocket: mail, news, photos & more. > > > -- > Jason Stajich > jason at bioperl.org > http://jason.open-bio.org/ > > > > > > > > --------------------------------- > Bored stiff? Loosen up... > Download and play hundreds of games for free on Yahoo! Games. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Mon Jun 18 18:50:38 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 18 Jun 2007 18:50:38 -0400 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <707D8821-68CB-40AE-A190-AD0F8F2C3915@uiuc.edu> References: <904670.24974.qm@web56513.mail.re3.yahoo.com> <707D8821-68CB-40AE-A190-AD0F8F2C3915@uiuc.edu> Message-ID: The perl version appears to be 5.8.5 though, so something strange appears to be going on too. George, can you please post the output of $ /usr/bin/perl -V -hilmar On Jun 18, 2007, at 6:33 PM, Chris Fields wrote: > As the error implies your local version of perl doesn't seem support > weak references, which means it doesn't have Scalar::Utils (which was > added to core after perl 5.6.1, I think). Try installing > Scalar::Utils to see what happens. 
> > chris > > On Jun 18, 2007, at 5:18 PM, George Heller wrote: > >> I tried running the below mentioned script and I seem to be getting >> the following error: >> >> Weak references are not implemented in the version of perl at / >> usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76 >> BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/ >> Bio/Tree/Node.pm line 76. >> Compilation failed in require at my.pl line 7. >> BEGIN failed--compilation aborted at my.pl line 7. >> >> My script looks something like, >> >> #!/usr/bin/perl >> use strict; >> #use warnings; >> use DBI; >> use Bio::Tree::Node; >> use Bio::DB::Taxonomy; >> use Bio::DB::Taxonomy::flatfile; >> my $idx_dir = '/tmp'; >> >> my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp'); >> my $db = new Bio::DB::Taxonomy(-source => 'flatfile', >> -nodesfile => $nodesfile, >> -namesfile => $namesfile, >> -directory => $idx_dir); >> my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); >> my @extant_children = grep { $_->is_Leaf } $node- >>> get_all_Descendents; >> >> foreach $field (@extant_children) { >> print "$field"; >> print "|"; >> print "\n"; >> } >> >> And I am running the script using the command, >> >> perl myscript.pl -v --names names.dmp --nodes nodes.dmp >> >> and I have the nodes.dmp and names.dmp files in the current >> directory. >> >> Thanks, >> George >> >> >> Jason Stajich wrote: >> It is implemented in the implementing class - DB::Taxonomy is >> just the base class. For example see the flatfile implementation >> Bio::DB::Taxonomy::flatfile >> >> See the scripts/taxa/local_taxonomydb_query.PLS for example using >> it: >> nodes and names are from NCBI taxonomy database. >> >> >> Here is an un-debugged copy+paste for your question that *should* >> work. 
>> >> >> use Bio::DB::Taxonomy >> my $idx_dir = '/tmp'; >> >> >> my ($nodefile,$namesfile) = ('nodes.dmp,'names.dmp'); >> my $db = new Bio::DB::Taxonomy(-source => 'flatfile', >> -nodesfile => $nodesfile, >> -namesfile => $namesfile, >> -directory => $idx_dir); >> my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); >> my @extant_children = grep { $_->is_Leaf } $node- >>> get_all_Descendents; >> >> >> >> >> -jason >> >> On Jun 18, 2007, at 10:07 AM, George Heller wrote: >> >> What exactly is the "node n" in the query below. When I issue >> this query, it says, >> >> >> relation "node" does not exist. >> >> >> I tried to use the get_all_Descendents method but it looks like >> in order to do a recursive call it calls the method >> each_Descendent. This method is not implemented in >> Bio::DB::Taxonomy. It just has a single line, >> >> >> shift->throw_not_implemented(); >> >> >> Thanks. >> George. >> >> >> Hilmar Lapp wrote: >> I'm a bit confused - it sounds like you have set up a local >> BioSQL >> database and loaded the NCBI taxonomy into the database. You can >> now >> use simple SQL to retrieve all descendants of a node in the tree >> given its NCBI taxonID such as >> >> >> SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n >> WHERE >> n.ncbi_taxon_id = :taxonID >> AND tn.left_value > n. left_value >> AND tn.right_value < n.right_value >> AND tn.taxon_id = tnm.taxon_id >> AND tn.name_class = 'scientific_name' >> >> >> BioPerl doesn't have a Taxonomy::biosql module yet (though this >> would >> seem like a worthwhile thing to add), so you can't use the >> Bio::DB::Taxonomy interface to do this against a BioSQL instance. >> >> >> However, BioPerl does have support for the flat-file download of >> the >> NCBI taxonomy database and indexes it, so you can simply use >> Taxonomy::{get_taxon,get_all_Descendants} using the flatfile >> download >> to achieve what you wanted to do in a less than 5 lines of perl. 
>> >> >> Although the recursive implementation of >> Taxonomy::get_all_Descendants >> () won't be lightning fast, it may still be perfectly fine for your >> application - are you sure it is not? >> >> >> -hilmar >> >> >> On Jun 18, 2007, at 12:21 AM, George Heller wrote: >> >> >> Thanks. And how can I assign the $node here in the below code, >> such >> that I can reference it to a particular taxon id record? I want to >> retrieve all the descendents from the taxonomy hierarchy, given a >> particular taxon id. >> >> >> I have a local db setup, in which I have uploaded data using the >> load_ncbi_taxonomy.pl script. >> >> >> Thanks. >> George >> >> >> Jason Stajich wrote: >> I assume you already figured out how to setup a local taxonomydb? >> >> >> >> >> You just want the extant species/leaves of the tree >> >> >> >> >> my @extant_children = grep { $_->is_Leaf } $node- >>> get_all_Descedents; >> >> >> >> >> >> >> -jason >> On Jun 17, 2007, at 11:41 AM, George Heller wrote: >> >> >> Hi all, >> >> >> >> >> Can anyone point me to some example that uses the >> get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at >> this, and I am not quite sure how to implement it. >> >> >> >> >> Thanks. >> George >> >> >> >> >> Sendu Bala wrote: >> George Heller wrote: >> Hi all, >> >> >> >> >> I am looking at extracting the taxonomy hierarchy for some taxon >> ids. >> What I plan to do is, for a given taxon id, say 33090, I want to >> extract all taxon ids that are children of this species. I do not >> just want the immediate children, but the children's children >> and so >> on. >> >> >> >> >> Any ideas on the way I can go about doing this? >> >> >> >> >> Well, you'll use Bio::DB::Taxonomy presumably, and >> each_Descendent in >> some kind of looping structure. Most easily a recursing sub. >> >> >> >> >> If you happen to code up something neat and efficient, why not >> share it >> with us and we could add it to the Taxonomy module(s). 
>> >> >> >> >> >> >> >> >> >> >> >> >> --------------------------------- >> Shape Yahoo! in your own image. Join our Network Research Panel >> today! >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> >> >> >> -- >> Jason Stajich >> jason at bioperl.org >> http://jason.open-bio.org/ >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> --------------------------------- >> Need a vacation? Get great deals to amazing places on Yahoo! >> Travel. >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> >> -- >> =========================================================== >> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >> =========================================================== >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> --------------------------------- >> Take the Internet to Go: Yahoo!Go puts the Internet in your >> pocket: mail, news, photos & more. >> >> >> -- >> Jason Stajich >> jason at bioperl.org >> http://jason.open-bio.org/ >> >> >> >> >> >> >> >> --------------------------------- >> Bored stiff? Loosen up... >> Download and play hundreds of games for free on Yahoo! Games. >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. 
Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From george.heller at yahoo.com Mon Jun 18 19:05:42 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 16:05:42 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To:
Message-ID: <706979.34648.qm@web56509.mail.re3.yahoo.com>

This is the output of /usr/bin/perl -V

Summary of my perl5 (revision 5 version 8 subversion 5) configuration:
  Platform:
    osname=linux, osvers=2.6.9-22.18.bz155725.elsmp, archname=i386-linux-thread-multi
    uname='linux hs20-bc1-4.build.redhat.com 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686 i686 i386 gnulinux '
    config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost -Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
    optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm'
    ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-2)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
    perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
    libc=/lib/libc-2.3.4.so, so=so, useshrplib=true, libperl=libperl.so
    gnulibc_version='2.3.4'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE'
    cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'

Characteristics of this binary (from libperl):
  Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS USE_LARGE_FILES PERL_IMPLICIT_CONTEXT
  Built under linux
  Compiled at Jul 24 2006 18:28:10
  @INC:
    /usr/lib/perl5/5.8.5/i386-linux-thread-multi
    /usr/lib/perl5/5.8.5
    /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.5
    /usr/lib/perl5/site_perl/5.8.4
    /usr/lib/perl5/site_perl/5.8.3
    /usr/lib/perl5/site_perl/5.8.2
    /usr/lib/perl5/site_perl/5.8.1
    /usr/lib/perl5/site_perl/5.8.0
    /usr/lib/perl5/site_perl
    /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.5
    /usr/lib/perl5/vendor_perl/5.8.4
    /usr/lib/perl5/vendor_perl/5.8.3
    /usr/lib/perl5/vendor_perl/5.8.2
    /usr/lib/perl5/vendor_perl/5.8.1
    /usr/lib/perl5/vendor_perl/5.8.0
    /usr/lib/perl5/vendor_perl

Thanks.
George
    .

Hilmar Lapp wrote:
The perl version appears to be 5.8.5 though, so something strange
appears to be going on too.

George, can you please post the output of

$ /usr/bin/perl -V

-hilmar

On Jun 18, 2007, at 6:33 PM, Chris Fields wrote:

> As the error implies your local version of perl doesn't seem support
> weak references, which means it doesn't have Scalar::Utils (which was
> added to core after perl 5.6.1, I think). Try installing
> Scalar::Utils to see what happens.
>
> chris

From jason at bioperl.org Mon Jun 18 19:22:08 2007
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 18 Jun 2007 16:22:08 -0700
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <706979.34648.qm@web56509.mail.re3.yahoo.com>
References: <706979.34648.qm@web56509.mail.re3.yahoo.com>
Message-ID:

Try installing the latest Scalar::Util

On Jun 18, 2007, at 4:05 PM, George Heller wrote:

> This is the output of /usr/bin/perl -V
>
> Summary of my perl5 (revision 5 version 8 subversion 5) configuration:
> Platform:
> osname=linux, osvers=2.6.9-22.18.bz155725.elsmp, archname=i386-
> linux-thread-multi
> uname='linux hs20-bc1-4.build.redhat.com
> 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686
> i686 i386 gnulinux '
> config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 -
> mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost -
> Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc.
- > Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux - > Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads - > Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db - > Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio - > Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/ > less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0' > hint=recommended, useposix=true, d_sigaction=define > usethreads=define use5005threads=undef useithreads=define > usemultiplicity=define > useperlio=define d_sfio=undef uselargefiles=define usesocks=undef > use64bitint=undef use64bitall=undef uselongdouble=undef > usemymalloc=n, bincompat5005=undef > Compiler: > cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno- > strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE - > D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm', > optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4', > cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict- > aliasing -pipe -I/usr/local/include -I/usr/include/gdbm' > ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-2)', > gccosandvers='' > intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234 > d_longlong=define, longlongsize=8, d_longdbl=define, > longdblsize=12 > ivtype='long', ivsize=4, nvtype='double', nvsize=8, > Off_t='off_t', lseeksize=8 > alignbytes=4, prototype=define > Linker and Libraries: > ld='gcc', ldflags =' -L/usr/local/lib' > libpth=/usr/local/lib /lib /usr/lib > libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil - > lpthread -lc > perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc > libc=/lib/libc-2.3.4.so, so=so, useshrplib=true, > libperl=libperl.so > gnulibc_version='2.3.4' > Dynamic Linking: > dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,- > E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE' > cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib' > > Characteristics of this binary (from 
libperl): > Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS > USE_LARGE_FILES PERL_IMPLICIT_CONTEXT > Built under linux > Compiled at Jul 24 2006 18:28:10 > @INC: > /usr/lib/perl5/5.8.5/i386-linux-thread-multi > /usr/lib/perl5/5.8.5 > /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi > /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi > /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi > /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi > /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi > /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi > /usr/lib/perl5/site_perl/5.8.5 > /usr/lib/perl5/site_perl/5.8.4 > /usr/lib/perl5/site_perl/5.8.3 > /usr/lib/perl5/site_perl/5.8.2 > /usr/lib/perl5/site_perl/5.8.1 > /usr/lib/perl5/site_perl/5.8.0 > /usr/lib/perl5/site_perl > /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi > /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi > /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi > /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi > /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi > /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi > /usr/lib/perl5/vendor_perl/5.8.5 > /usr/lib/perl5/vendor_perl/5.8.4 > /usr/lib/perl5/vendor_perl/5.8.3 > /usr/lib/perl5/vendor_perl/5.8.2 > /usr/lib/perl5/vendor_perl/5.8.1 > /usr/lib/perl5/vendor_perl/5.8.0 > /usr/lib/perl5/vendor_perl > > Thanks. > George > . > > Hilmar Lapp wrote: > The perl version appears to be 5.8.5 though, so something strange > appears to be going on too. > > George, can you please post the output of > > $ /usr/bin/perl -V > > -hilmar > > On Jun 18, 2007, at 6:33 PM, Chris Fields wrote: > >> As the error implies your local version of perl doesn't seem support >> weak references, which means it doesn't have Scalar::Utils (which was >> added to core after perl 5.6.1, I think). Try installing >> Scalar::Utils to see what happens. 
>> >> chris >> >> On Jun 18, 2007, at 5:18 PM, George Heller wrote: >> >>> I tried running the below mentioned script and I seem to be getting >>> the following error: >>> >>> Weak references are not implemented in the version of perl at / >>> usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76 >>> BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/ >>> Bio/Tree/Node.pm line 76. >>> Compilation failed in require at my.pl line 7. >>> BEGIN failed--compilation aborted at my.pl line 7. >>> >>> My script looks something like, >>> >>> #!/usr/bin/perl >>> use strict; >>> #use warnings; >>> use DBI; >>> use Bio::Tree::Node; >>> use Bio::DB::Taxonomy; >>> use Bio::DB::Taxonomy::flatfile; >>> my $idx_dir = '/tmp'; >>> >>> my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp'); >>> my $db = new Bio::DB::Taxonomy(-source => 'flatfile', >>> -nodesfile => $nodesfile, >>> -namesfile => $namesfile, >>> -directory => $idx_dir); >>> my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); >>> my @extant_children = grep { $_->is_Leaf } $node- >>>> get_all_Descendents; >>> >>> foreach $field (@extant_children) { >>> print "$field"; >>> print "|"; >>> print "\n"; >>> } >>> >>> And I am running the script using the command, >>> >>> perl myscript.pl -v --names names.dmp --nodes nodes.dmp >>> >>> and I have the nodes.dmp and names.dmp files in the current >>> directory. >>> >>> Thanks, >>> George >>> >>> >>> Jason Stajich wrote: >>> It is implemented in the implementing class - DB::Taxonomy is >>> just the base class. For example see the flatfile implementation >>> Bio::DB::Taxonomy::flatfile >>> >>> See the scripts/taxa/local_taxonomydb_query.PLS for example using >>> it: >>> nodes and names are from NCBI taxonomy database. >>> >>> >>> Here is an un-debugged copy+paste for your question that *should* >>> work. 
>>> >>> >>> use Bio::DB::Taxonomy >>> my $idx_dir = '/tmp'; >>> >>> >>> my ($nodefile,$namesfile) = ('nodes.dmp,'names.dmp'); >>> my $db = new Bio::DB::Taxonomy(-source => 'flatfile', >>> -nodesfile => $nodesfile, >>> -namesfile => $namesfile, >>> -directory => $idx_dir); >>> my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); >>> my @extant_children = grep { $_->is_Leaf } $node- >>>> get_all_Descendents; >>> >>> >>> >>> >>> -jason >>> >>> On Jun 18, 2007, at 10:07 AM, George Heller wrote: >>> >>> What exactly is the "node n" in the query below. When I issue >>> this query, it says, >>> >>> >>> relation "node" does not exist. >>> >>> >>> I tried to use the get_all_Descendents method but it looks like >>> in order to do a recursive call it calls the method >>> each_Descendent. This method is not implemented in >>> Bio::DB::Taxonomy. It just has a single line, >>> >>> >>> shift->throw_not_implemented(); >>> >>> >>> Thanks. >>> George. >>> >>> >>> Hilmar Lapp wrote: >>> I'm a bit confused - it sounds like you have set up a local >>> BioSQL >>> database and loaded the NCBI taxonomy into the database. You can >>> now >>> use simple SQL to retrieve all descendants of a node in the tree >>> given its NCBI taxonID such as >>> >>> >>> SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n >>> WHERE >>> n.ncbi_taxon_id = :taxonID >>> AND tn.left_value > n. left_value >>> AND tn.right_value < n.right_value >>> AND tn.taxon_id = tnm.taxon_id >>> AND tn.name_class = 'scientific_name' >>> >>> >>> BioPerl doesn't have a Taxonomy::biosql module yet (though this >>> would >>> seem like a worthwhile thing to add), so you can't use the >>> Bio::DB::Taxonomy interface to do this against a BioSQL instance. 
>>> >>> >>> However, BioPerl does have support for the flat-file download of >>> the >>> NCBI taxonomy database and indexes it, so you can simply use >>> Taxonomy::{get_taxon,get_all_Descendants} using the flatfile >>> download >>> to achieve what you wanted to do in a less than 5 lines of perl. >>> >>> >>> Although the recursive implementation of >>> Taxonomy::get_all_Descendants >>> () won't be lightning fast, it may still be perfectly fine for your >>> application - are you sure it is not? >>> >>> >>> -hilmar >>> >>> >>> On Jun 18, 2007, at 12:21 AM, George Heller wrote: >>> >>> >>> Thanks. And how can I assign the $node here in the below code, >>> such >>> that I can reference it to a particular taxon id record? I want to >>> retrieve all the descendents from the taxonomy hierarchy, given a >>> particular taxon id. >>> >>> >>> I have a local db setup, in which I have uploaded data using the >>> load_ncbi_taxonomy.pl script. >>> >>> >>> Thanks. >>> George >>> >>> >>> Jason Stajich wrote: >>> I assume you already figured out how to setup a local taxonomydb? >>> >>> >>> >>> >>> You just want the extant species/leaves of the tree >>> >>> >>> >>> >>> my @extant_children = grep { $_->is_Leaf } $node- >>>> get_all_Descedents; >>> >>> >>> >>> >>> >>> >>> -jason >>> On Jun 17, 2007, at 11:41 AM, George Heller wrote: >>> >>> >>> Hi all, >>> >>> >>> >>> >>> Can anyone point me to some example that uses the >>> get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at >>> this, and I am not quite sure how to implement it. >>> >>> >>> >>> >>> Thanks. >>> George >>> >>> >>> >>> >>> Sendu Bala wrote: >>> George Heller wrote: >>> Hi all, >>> >>> >>> >>> >>> I am looking at extracting the taxonomy hierarchy for some taxon >>> ids. >>> What I plan to do is, for a given taxon id, say 33090, I want to >>> extract all taxon ids that are children of this species. I do not >>> just want the immediate children, but the children's children >>> and so >>> on. 
>>> >>> >>> >>> >>> Any ideas on the way I can go about doing this? >>> >>> >>> >>> >>> Well, you'll use Bio::DB::Taxonomy presumably, and >>> each_Descendent in >>> some kind of looping structure. Most easily a recursing sub. >>> >>> >>> >>> >>> If you happen to code up something neat and efficient, why not >>> share it >>> with us and we could add it to the Taxonomy module(s). >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> --------------------------------- >>> Shape Yahoo! in your own image. Join our Network Research Panel >>> today! >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> >>> >>> >>> -- >>> Jason Stajich >>> jason at bioperl.org >>> http://jason.open-bio.org/ >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> --------------------------------- >>> Need a vacation? Get great deals to amazing places on Yahoo! >>> Travel. >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> >>> -- >>> =========================================================== >>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >>> =========================================================== >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> --------------------------------- >>> Take the Internet to Go: Yahoo!Go puts the Internet in your >>> pocket: mail, news, photos & more. >>> >>> >>> -- >>> Jason Stajich >>> jason at bioperl.org >>> http://jason.open-bio.org/ >>> >>> >>> >>> >>> >>> >>> >>> --------------------------------- >>> Bored stiff? Loosen up... >>> Download and play hundreds of games for free on Yahoo! Games. 
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> --
> ===========================================================
> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
> ===========================================================
>
> ---------------------------------
> Expecting? Get great news right away with email Auto-Check.
> Try the Yahoo! Mail Beta.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/

From george.heller at yahoo.com Mon Jun 18 20:04:00 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 17:04:00 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To:
Message-ID: <424035.72876.qm@web56507.mail.re3.yahoo.com>

OK, I installed the latest Scalar::Util and the script seems to be
working. But I am confused about where exactly I need to look for the
descendent taxon ids once the script has run. I looked in the /tmp/
directory, but I couldn't understand much of what is there.

Sorry to bother you, and I really appreciate your patience.

Thanks.
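
On the question of where the descendent taxon ids end up: in the script quoted in this thread they are held in memory in @extant_children and are only visible if the script prints them, and a print of the node object itself would most likely show a stringified reference rather than an id. As a cross-check that does not depend on BioPerl at all, here is a minimal sketch in core Perl that reads nodes.dmp directly and lists every descendant id. It is not the BioPerl implementation. The sub names read_children and all_descendants are made up for this illustration, and the sketch assumes the usual NCBI dump layout (fields separated by tab, pipe, tab, with the taxon id first and the parent taxon id second).

```perl
#!/usr/bin/perl
# Sketch only: list every descendant taxon id of one node by reading
# nodes.dmp directly with core Perl. No BioPerl calls are made here.
use strict;
use warnings;

# Build a parent => [children] map from an open nodes.dmp filehandle.
# Assumption: fields are separated by "\t|\t", taxon id first and
# parent taxon id second, each record ending in "\t|".
sub read_children {
    my ($fh) = @_;
    my %children;
    while ( my $line = <$fh> ) {
        chomp $line;
        $line =~ s/\t\|$//;    # drop the end-of-record marker
        my ( $id, $parent ) = split /\t\|\t/, $line;
        next unless defined $parent;
        next if $id eq $parent;    # the root lists itself as its parent
        push @{ $children{$parent} }, $id;
    }
    return \%children;
}

# Collect all descendants (children, grandchildren, ...) of $start.
# An explicit stack is used here in place of a recursing sub.
sub all_descendants {
    my ( $children, $start ) = @_;
    my @found;
    my @todo = ($start);
    while ( defined( my $id = pop @todo ) ) {
        my @kids = @{ $children->{$id} || [] };
        push @found, @kids;
        push @todo,  @kids;
    }
    return @found;
}

# Runs only when called as: perl descendants.pl nodes.dmp 33090
if ( @ARGV == 2 ) {
    my ( $nodes_file, $taxon_id ) = @ARGV;
    open my $fh, '<', $nodes_file or die "Cannot open $nodes_file: $!";
    my $children = read_children($fh);
    close $fh;
    print "$_\n" for sort { $a <=> $b } all_descendants( $children, $taxon_id );
}
```

The script name descendants.pl is also only a placeholder. Keeping the whole parent-to-children map in a hash trades memory for simplicity, which should be acceptable for a one-off extraction but is an assumption about the size of the dump being used.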
George

Jason Stajich wrote:
Try installing the latest Scalar::Util

On Jun 18, 2007, at 4:05 PM, George Heller wrote:
This is the output of /usr/bin/perl -V

Summary of my perl5 (revision 5 version 8 subversion 5) configuration:
  Platform:
    osname=linux, osvers=2.6.9-22.18.bz155725.elsmp, archname=i386-linux-thread-multi
    uname='linux hs20-bc1-4.build.redhat.com 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686 i686 i386 gnulinux '
    config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost -Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
    optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm'
    ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-2)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
    perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
    libc=/lib/libc-2.3.4.so, so=so, useshrplib=true, libperl=libperl.so
    gnulibc_version='2.3.4'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE'
    cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'

Characteristics of this binary (from libperl):
  Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS USE_LARGE_FILES PERL_IMPLICIT_CONTEXT
  Built under linux
  Compiled at Jul 24 2006 18:28:10
  @INC:
    /usr/lib/perl5/5.8.5/i386-linux-thread-multi
    /usr/lib/perl5/5.8.5
    /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.5
    /usr/lib/perl5/site_perl/5.8.4
    /usr/lib/perl5/site_perl/5.8.3
    /usr/lib/perl5/site_perl/5.8.2
    /usr/lib/perl5/site_perl/5.8.1
    /usr/lib/perl5/site_perl/5.8.0
    /usr/lib/perl5/site_perl
    /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.5
    /usr/lib/perl5/vendor_perl/5.8.4
    /usr/lib/perl5/vendor_perl/5.8.3
    /usr/lib/perl5/vendor_perl/5.8.2
    /usr/lib/perl5/vendor_perl/5.8.1
    /usr/lib/perl5/vendor_perl/5.8.0
    /usr/lib/perl5/vendor_perl

Thanks.
George
Hilmar Lapp wrote:
The perl version appears to be 5.8.5 though, so something strange appears to be going on too.

George, can you please post the output of

$ /usr/bin/perl -V

-hilmar

On Jun 18, 2007, at 6:33 PM, Chris Fields wrote:
As the error implies, your local version of perl doesn't seem to support weak references, which means it doesn't have Scalar::Util (which was added to core after perl 5.6.1, I think). Try installing Scalar::Util to see what happens.

chris

On Jun 18, 2007, at 5:18 PM, George Heller wrote:
I tried running the script mentioned below and I seem to be getting the following error:

Weak references are not implemented in the version of perl at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76.
Compilation failed in require at my.pl line 7.
BEGIN failed--compilation aborted at my.pl line 7.

My script looks something like,

#!/usr/bin/perl
use strict;
#use warnings;
use DBI;
use Bio::Tree::Node;
use Bio::DB::Taxonomy;
use Bio::DB::Taxonomy::flatfile;
my $idx_dir = '/tmp';

my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
                               -nodesfile => $nodefile,
                               -namesfile => $namesfile,
                               -directory => $idx_dir);
my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

foreach my $field (@extant_children) {
    print "$field";
    print "|";
    print "\n";
}

And I am running the script using the command,

perl myscript.pl -v --names names.dmp --nodes nodes.dmp

and I have the nodes.dmp and names.dmp files in the current directory.

Thanks,
George

Jason Stajich wrote:
It is implemented in the implementing class - DB::Taxonomy is just the base class. For example, see the flatfile implementation Bio::DB::Taxonomy::flatfile

See scripts/taxa/local_taxonomydb_query.PLS for an example using it; the nodes and names files are from the NCBI taxonomy database.

Here is an un-debugged copy+paste for your question that *should* work.

use Bio::DB::Taxonomy;
my $idx_dir = '/tmp';

my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
                               -nodesfile => $nodefile,
                               -namesfile => $namesfile,
                               -directory => $idx_dir);
my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

-jason

On Jun 18, 2007, at 10:07 AM, George Heller wrote:
What exactly is the "node n" in the query below? When I issue this query, it says,

relation "node" does not exist.

I tried to use the get_all_Descendents method, but it looks like in order to do a recursive call it calls the method each_Descendent. This method is not implemented in Bio::DB::Taxonomy. It just has a single line,

shift->throw_not_implemented();

Thanks.
George.

Hilmar Lapp wrote:
I'm a bit confused - it sounds like you have set up a local BioSQL database and loaded the NCBI taxonomy into the database. You can now use simple SQL to retrieve all descendants of a node in the tree given its NCBI taxonID, such as

SELECT tn.*, tnm.name
FROM taxon tn, taxon_name tnm, node n
WHERE n.ncbi_taxon_id = :taxonID
AND tn.left_value > n.left_value
AND tn.right_value < n.right_value
AND tn.taxon_id = tnm.taxon_id
AND tn.name_class = 'scientific_name'

BioPerl doesn't have a Taxonomy::biosql module yet (though this would seem like a worthwhile thing to add), so you can't use the Bio::DB::Taxonomy interface to do this against a BioSQL instance.

However, BioPerl does have support for the flat-file download of the NCBI taxonomy database and indexes it, so you can simply use Taxonomy::{get_taxon,get_all_Descendants} using the flatfile download to achieve what you wanted to do in less than 5 lines of perl.

Although the recursive implementation of Taxonomy::get_all_Descendants() won't be lightning fast, it may still be perfectly fine for your application - are you sure it is not?

-hilmar

On Jun 18, 2007, at 12:21 AM, George Heller wrote:
Thanks. And how can I assign the $node in the code below, so that it refers to a particular taxon id record? I want to retrieve all the descendents from the taxonomy hierarchy, given a particular taxon id.

I have a local db set up, into which I have loaded data using the load_ncbi_taxonomy.pl script.

Thanks.
George

Jason Stajich wrote:
I assume you already figured out how to set up a local taxonomydb?

You just want the extant species/leaves of the tree

my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

-jason

On Jun 17, 2007, at 11:41 AM, George Heller wrote:
Hi all,

Can anyone point me to some example that uses the get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at this, and I am not quite sure how to implement it.

Thanks.
George

Sendu Bala wrote:
George Heller wrote:
> Hi all,
>
> I am looking at extracting the taxonomy hierarchy for some taxon ids.
> What I plan to do is, for a given taxon id, say 33090, I want to
> extract all taxon ids that are children of this species. I do not
> just want the immediate children, but the children's children and so
> on.
>
> Any ideas on the way I can go about doing this?

Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in some kind of looping structure. Most easily a recursing sub.

If you happen to code up something neat and efficient, why not share it with us and we could add it to the Taxonomy module(s).
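The "Weak references are not implemented" error quoted in this thread is raised when the installed Scalar::Util cannot provide weaken(). A minimal sketch of how one might check for that before loading Bio::Tree::Node, assuming only core Perl; the helper name has_weak_refs is made up for illustration and is not part of BioPerl or Scalar::Util:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Scalar::Util ();

# Hypothetical helper (not a BioPerl function): returns 1 if weaken()
# works in this perl, 0 if calling it dies, as in the error quoted above.
sub has_weak_refs {
    my $target = {};          # keeps a strong reference alive
    my $ref    = $target;     # second reference, the one we try to weaken
    my $ok = eval { Scalar::Util::weaken($ref); 1 };
    return $ok ? 1 : 0;
}

print "Scalar::Util version: $Scalar::Util::VERSION\n";
print has_weak_refs()
    ? "weak references are available\n"
    : "weak references are NOT available; try upgrading Scalar::Util\n";
```

If this reports that weak references are not available, upgrading Scalar::Util, as suggested in the thread, is the likely fix.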
From jason at bioperl.org Mon Jun 18 20:17:34 2007
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 18 Jun 2007 17:17:34 -0700
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <424035.72876.qm@web56507.mail.re3.yahoo.com>
References: <424035.72876.qm@web56507.mail.re3.yahoo.com>
Message-ID: <1FE460CB-F001-4EAE-9E83-FDF52AFFA5D0@bioperl.org>

All the children are in this array. You get to decide what you want to do with them. In the following example I print the id, rank, and scientific name out to the screen.

Because this is a taxonomy db query you are getting back Bio::Taxonomy::Taxon objects, so read the documentation for this module to see what you can do with the object. I would also suggest spending a little time with the Getting Started and HOWTO:Trees documentation on the website to get familiar with the objects and nomenclature.

my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
for my $child ( @extant_children ) {
    print "id is ", $child->id, "\n";                           # NCBI taxa id
    print "rank is ", $child->rank, "\n";                       # e.g. species
    print "scientific name is ", $child->scientific_name, "\n"; # scientific name
}

On Jun 18, 2007, at 5:04 PM, George Heller wrote:

> Ok, I installed the latest Scalar::Util and the script seems to
> be working. But I am confused about where exactly I need to look for
> the descendent taxon ids once the script is run. I did look into the
> /tmp/ directory, but I couldn't understand much.
>
> Sorry to be bothering you; I really appreciate your patience.
>
> Thanks.
--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/

From george.heller at yahoo.com Mon Jun 18 20:29:31 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 17:29:31 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <1FE460CB-F001-4EAE-9E83-FDF52AFFA5D0@bioperl.org>
Message-ID: <369098.81077.qm@web56507.mail.re3.yahoo.com>

But the problem is that I don't really get any output on the screen.
In the /tmp directory I get 4 files, namely parents, nodes, id2names and names2id, but I don't know what to make of them. This is what my script looks like,

#!/usr/bin/perl
use strict;
#use warnings;
use DBI;
use Bio::Tree::Node;
use Bio::DB::Taxonomy;
use Bio::DB::Taxonomy::flatfile;
my $idx_dir = '/tmp';
my $nodefile;
my $namesfile;
my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
                               -nodesfile => $nodefile,
                               -namesfile => $namesfile,
                               -directory => $idx_dir);
my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
for my $child ( @extant_children ) {
    print "id is ", $child->id, "\n";                           # NCBI taxa id
    print "rank is ", $child->rank, "\n";                       # e.g. species
    print "scientific name is ", $child->scientific_name, "\n"; # scientific name
}

Thanks.
George

Jason Stajich wrote:
All the children are in this array. You get to decide what you want to do with them. In the following example I print the id, rank, and scientific name out to the screen.

Because this is a taxonomy db query you are getting back Bio::Taxonomy::Taxon objects, so read the documentation for this module to see what you can do with the object. I would also suggest spending a little time with the Getting Started and HOWTO:Trees documentation on the website to get familiar with the objects and nomenclature.

my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
for my $child ( @extant_children ) {
    print "id is ", $child->id, "\n";                           # NCBI taxa id
    print "rank is ", $child->rank, "\n";                       # e.g. species
    print "scientific name is ", $child->scientific_name, "\n"; # scientific name
}

On Jun 18, 2007, at 5:04 PM, George Heller wrote:
Ok, I installed the latest Scalar::Util and the script seems to be working. But I am confused about where exactly I need to look for the descendent taxon ids once the script is run. I did look into the /tmp/ directory, but I couldn't understand much.
Sorry to be bothering, really appreaciate your patience. Thanks. George Jason Stajich wrote: Try installing the latest Scalar::Util On Jun 18, 2007, at 4:05 PM, George Heller wrote: This is the output of /usr/bin/perl -V Summary of my perl5 (revision 5 version 8 subversion 5) configuration: Platform: osname=linux, osvers=2.6.9-22.18.bz155725.elsmp, archname=i386-linux-thread-multi uname='linux hs20-bc1-4.build.redhat.com 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686 i686 i386 gnulinux ' config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost -Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0' hint=recommended, useposix=true, d_sigaction=define usethreads=define use5005threads=undef useithreads=define usemultiplicity=define useperlio=define d_sfio=undef uselargefiles=define usesocks=undef use64bitint=undef use64bitall=undef uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm', optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4', cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm' ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-2)', gccosandvers='' intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12 ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', 
lseeksize=8 alignbytes=4, prototype=define Linker and Libraries: ld='gcc', ldflags =' -L/usr/local/lib' libpth=/usr/local/lib /lib /usr/lib libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc libc=/lib/libc-2.3.4.so, so=so, useshrplib=true, libperl=libperl.so gnulibc_version='2.3.4' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE' cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib' Characteristics of this binary (from libperl): Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS USE_LARGE_FILES PERL_IMPLICIT_CONTEXT Built under linux Compiled at Jul 24 2006 18:28:10 @INC: /usr/lib/perl5/5.8.5/i386-linux-thread-multi /usr/lib/perl5/5.8.5 /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.5 /usr/lib/perl5/site_perl/5.8.4 /usr/lib/perl5/site_perl/5.8.3 /usr/lib/perl5/site_perl/5.8.2 /usr/lib/perl5/site_perl/5.8.1 /usr/lib/perl5/site_perl/5.8.0 /usr/lib/perl5/site_perl /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.5 /usr/lib/perl5/vendor_perl/5.8.4 /usr/lib/perl5/vendor_perl/5.8.3 /usr/lib/perl5/vendor_perl/5.8.2 /usr/lib/perl5/vendor_perl/5.8.1 /usr/lib/perl5/vendor_perl/5.8.0 /usr/lib/perl5/vendor_perl Thanks. George . 
Hilmar Lapp wrote:

The perl version appears to be 5.8.5 though, so something strange appears to be going on too.

George, can you please post the output of

$ /usr/bin/perl -V

-hilmar

On Jun 18, 2007, at 6:33 PM, Chris Fields wrote:

As the error implies, your local version of perl doesn't seem to support weak references, which means it doesn't have Scalar::Util (which was added to core after perl 5.6.1, I think). Try installing Scalar::Util to see what happens.

chris

On Jun 18, 2007, at 5:18 PM, George Heller wrote:

I tried running the below mentioned script and I seem to be getting the following error:

Weak references are not implemented in the version of perl at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76.
Compilation failed in require at my.pl line 7.
BEGIN failed--compilation aborted at my.pl line 7.

My script looks something like,

#!/usr/bin/perl
use strict;
#use warnings;
use DBI;
use Bio::Tree::Node;
use Bio::DB::Taxonomy;
use Bio::DB::Taxonomy::flatfile;
my $idx_dir = '/tmp';

my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
                               -nodesfile => $nodefile,
                               -namesfile => $namesfile,
                               -directory => $idx_dir);
my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

foreach my $field (@extant_children) {
    print "$field";
    print "|";
    print "\n";
}

And I am running the script using the command,

perl myscript.pl -v --names names.dmp --nodes nodes.dmp

and I have the nodes.dmp and names.dmp files in the current directory.

Thanks,
George

Jason Stajich wrote:

It is implemented in the implementing class - DB::Taxonomy is just the base class. For example see the flatfile implementation Bio::DB::Taxonomy::flatfile

See the scripts/taxa/local_taxonomydb_query.PLS for an example using it: nodes and names are from the NCBI taxonomy database.

Here is an un-debugged copy+paste for your question that *should* work.

use Bio::DB::Taxonomy;
my $idx_dir = '/tmp';

my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
                               -nodesfile => $nodefile,
                               -namesfile => $namesfile,
                               -directory => $idx_dir);
my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

-jason

On Jun 18, 2007, at 10:07 AM, George Heller wrote:

What exactly is the "node n" in the query below? When I issue this query, it says,

relation "node" does not exist.

I tried to use the get_all_Descendents method but it looks like in order to do a recursive call it calls the method each_Descendent. This method is not implemented in Bio::DB::Taxonomy. It just has a single line,

shift->throw_not_implemented();

Thanks.
George.

Hilmar Lapp wrote:

I'm a bit confused - it sounds like you have set up a local BioSQL database and loaded the NCBI taxonomy into the database. You can now use simple SQL to retrieve all descendants of a node in the tree given its NCBI taxonID such as

SELECT tn.*, tnm.name
FROM taxon tn, taxon_name tnm, node n
WHERE n.ncbi_taxon_id = :taxonID
AND tn.left_value > n.left_value
AND tn.right_value < n.right_value
AND tn.taxon_id = tnm.taxon_id
AND tn.name_class = 'scientific_name'

BioPerl doesn't have a Taxonomy::biosql module yet (though this would seem like a worthwhile thing to add), so you can't use the Bio::DB::Taxonomy interface to do this against a BioSQL instance.

However, BioPerl does have support for the flat-file download of the NCBI taxonomy database and indexes it, so you can simply use Taxonomy::{get_taxon,get_all_Descendants} using the flatfile download to achieve what you wanted to do in less than 5 lines of perl.

Although the recursive implementation of Taxonomy::get_all_Descendants() won't be lightning fast, it may still be perfectly fine for your application - are you sure it is not?

-hilmar

On Jun 18, 2007, at 12:21 AM, George Heller wrote:

Thanks. And how can I assign the $node here in the below code, such that I can reference it to a particular taxon id record? I want to retrieve all the descendents from the taxonomy hierarchy, given a particular taxon id.

I have a local db setup, in which I have uploaded data using the load_ncbi_taxonomy.pl script.

Thanks.
George

Jason Stajich wrote:

I assume you already figured out how to set up a local taxonomydb?

You just want the extant species/leaves of the tree

my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

-jason

On Jun 17, 2007, at 11:41 AM, George Heller wrote:

Hi all,

Can anyone point me to some example that uses the get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at this, and I am not quite sure how to implement it.

Thanks.
George

Sendu Bala wrote:

George Heller wrote:

Hi all,

I am looking at extracting the taxonomy hierarchy for some taxon ids. What I plan to do is, for a given taxon id, say 33090, I want to extract all taxon ids that are children of this species. I do not just want the immediate children, but the children's children and so on.

Any ideas on the way I can go about doing this?

Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in some kind of looping structure. Most easily a recursing sub.

If you happen to code up something neat and efficient, why not share it with us and we could add it to the Taxonomy module(s).
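As a minimal sketch of the recursive-descent idea discussed in this thread (collect a node's children, then their children, and so on, and keep the leaves), the following plain-Perl example walks a toy parent table. The table, its IDs and the subroutine names all_descendants and leaf_descendants are hypothetical; none of them come from BioPerl or from the NCBI dump. It only illustrates the kind of traversal that get_all_Descendents is described as performing, and it uses a work queue rather than a recursing sub.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toy child => parent table. These IDs are invented for illustration
# and are not real NCBI taxon IDs.
my %parent_of = (
    2 => 1,
    3 => 1,
    4 => 2,
    5 => 2,
    6 => 3,
);

# Invert the table into parent => [children].
my %children_of;
for my $child (sort { $a <=> $b } keys %parent_of) {
    push @{ $children_of{ $parent_of{$child} } }, $child;
}

# Return every descendant of $root: children, grandchildren, and so on.
# A queue is used instead of recursion so deep trees do not nest calls.
sub all_descendants {
    my ($root) = @_;
    my @found;
    my @todo = @{ $children_of{$root} || [] };
    while (@todo) {
        my $id = shift @todo;
        push @found, $id;
        push @todo, @{ $children_of{$id} || [] };
    }
    return @found;
}

# A leaf is a descendant that has no children of its own.
sub leaf_descendants {
    my ($root) = @_;
    return grep { !exists $children_of{$_} } all_descendants($root);
}

print "all:    ", join(',', all_descendants(1)),  "\n";
print "leaves: ", join(',', leaf_descendants(1)), "\n";
```

With a real taxonomy the table would be filled from the parent column of the dump instead of being hard-coded, and the grep for leaves plays the same role as the is_Leaf filter in the scripts quoted in this thread.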
From jason at bioperl.org Mon Jun 18 21:05:43 2007
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 18 Jun 2007 18:05:43 -0700
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <369098.81077.qm@web56507.mail.re3.yahoo.com>
References: <369098.81077.qm@web56507.mail.re3.yahoo.com>
Message-ID:

The files are indexes because you are indexing a flatfile - this speeds up the lookup so the second time you run the script it doesn't have to index. You don't need to look at the files, they won't make sense to a human!

The reason it isn't printing anything is that someone didn't write the implementation quite right. This code was overhauled by Sendu before the last release, and I guess something didn't quite get connected.

I checked in code that has the Bio::Taxon delegating now to a DB handle for the each_Descendent call. You can either patch your code or just use the code listed here:
http://bioperl.org/wiki/Module:Bio::DB::Taxonomy

On Jun 18, 2007, at 5:29 PM, George Heller wrote:

> But the problem is that I don't really get any output on the
> screen. In the /tmp directory I get 4 files namely parents, nodes,
> id2names and names2id, but I don't know what to make of them. This
> is what my script looks like,
>
> #!/usr/bin/perl
> use strict;
> #use warnings;
> use DBI;
> use Bio::Tree::Node;
> use Bio::DB::Taxonomy;
> use Bio::DB::Taxonomy::flatfile;
> my $idx_dir = '/tmp';
> my $nodefile;
> my $namesfile;
>
> my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
> my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
>                                -nodesfile => $nodefile,
>                                -namesfile => $namesfile,
>                                -directory => $idx_dir);
> my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
> for my $child ( @extant_children ) {
>     print "id is ", $child->id, "\n"; # NCBI taxa id
>     print "rank is ", $child->rank, "\n"; # e.g. species
>     print "scientific name is ", $child->scientific_name, "\n"; # scientific name
> }
>
> Thanks.
> George
>
> Jason Stajich wrote:
> All the children are in this array.
>
> You get to decide what you want to do with them. In the following
> example I print the id, rank, and scientific name out to the screen.
> Because this is a taxonomy db query you are getting back
> Bio::Taxonomy::Taxon objects so read the documentation for this
> module to see what you can do with the object.
> I would also suggest spending a little time with the Getting
> started and HOWTO:Trees documentation on the website to get
> familiar with the objects and nomenclature.
>
> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
> for my $child ( @extant_children ) {
>     print "id is ", $child->id, "\n"; # NCBI taxa id
>     print "rank is ", $child->rank, "\n"; # e.g. species
>     print "scientific name is ", $child->scientific_name, "\n"; # scientific name
> }
>
> On Jun 18, 2007, at 5:04 PM, George Heller wrote:
>
> Ok, I installed the latest of Scalar::Util and the script seems
> to be working. But I am confused where exactly I need to look for
> the descendent taxon ids once the script is run. I did look into
> the /tmp/ directory, but I couldn't understand much.
>
> Sorry to be bothering, really appreciate your patience.
>
> Thanks.
> George

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/

From torsten.seemann at infotech.monash.edu.au Mon Jun 18 21:21:04 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Tue, 19 Jun 2007 11:21:04 +1000
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <4676A01F.30205@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk> <4676A01F.30205@sendu.me.uk>
Message-ID:

Sendu,

> >> Can anyone offer a
> >> way to systematically find at least the test scripts which access the
> >> internet, if not the specific tests within?

Perhaps you could use 'strace' to list network system calls for each test script, and grep out AF_INET connections?
% strace -e trace=network command_to_test 2>&1 | grep AF_INET I'm not an strace expert but it might do what you need. -- --Torsten Seemann --Victorian Bioinformatics Consortium, Monash University --Tel +61 3 9905 9010 From george.heller at yahoo.com Mon Jun 18 21:16:10 2007 From: george.heller at yahoo.com (George Heller) Date: Mon, 18 Jun 2007 18:16:10 -0700 (PDT) Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: Message-ID: <815364.33231.qm@web56512.mail.re3.yahoo.com> Works perfectly. Thanks so much Jason, Hilmar, Chris. You've been a great help! Thanks. George Jason Stajich wrote: The files are indexes because you are indexing a flatfile - this speeds up the lookup so the second time you run the script it doesn't have to index. You don't need to look at the files, they won't make sense to a human! The reason it isn't printing anything is someone didn't really write the implementation quite right. This code was overhauled by Sendu before the last release I guess something didn't quite get connected. I checked in code that has the Bio::Taxon delegating now to a DB handle for the each_Descendent call. You can either patch your code or just use the code listed here: http://bioperl.org/wiki/Module:Bio::DB::Taxonomy On Jun 18, 2007, at 5:29 PM, George Heller wrote: But the problem is that I don't really get any output on the screen. In the /tmp directory I get 4 files namely parents, nodes, id2names and names2id, but I dont know what to make of them. 
This is what my script looks like, #!/usr/bin/perl use strict; #use warnings; use DBI; use Bio::Tree::Node; use Bio::DB::Taxonomy; use Bio::DB::Taxonomy::flatfile; my $idx_dir = '/tmp'; my $nodefile; my $namesfile; my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp'); my $db = new Bio::DB::Taxonomy(-source => 'flatfile', -nodesfile => $nodefile, -namesfile => $namesfile, -directory => $idx_dir); my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents; for my $child ( @extant_children ) { print "id is ", $child->id, "\n"; # NCBI taxa id print "rank is ", $child->rank, "\n"; # e.g. species print "scientific name is ", $child->scientific_name, "\n"; # scientific name } Thanks. George Jason Stajich wrote: All the children are in this array. You get to decide what you want to do with them. In the following example I print the id, rank, and scientific name out to the screen. Because this is a taxonomy db query you are getting back Bio::Taxonomy::Taxon objects so read the documentation for this module to see what you can do with the object. I would also suggest spending a little time with the Getting started and HOWTO:Trees documentation on the website to get familiar with the objects and nomenclature. my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents; for my $child ( @extant_children ) { print "id is ", $child->id, "\n"; # NCBI taxa id print "rank is ", $child->rank, "\n"; # e.g. species print "scientific name is ", $child->scientific_name, "\n"; # scientific name } On Jun 18, 2007, at 5:04 PM, George Heller wrote: Ok, I installed the latest of Scalar::Util and the script seems to be working. But I am confused where exactly I need to look for the descendent taxon ids once the script is run. I did look into the /tmp/ directory, but I couldnt understand much. Sorry to be bothering, really appreaciate your patience. Thanks. 
George Jason Stajich wrote: Try installing the latest Scalar::Util On Jun 18, 2007, at 4:05 PM, George Heller wrote: This is the output of /usr/bin/perl -V Summary of my perl5 (revision 5 version 8 subversion 5) configuration: Platform: osname=linux, osvers=2.6.9-22.18.bz155725.elsmp, archname=i386-linux-thread-multi uname='linux hs20-bc1-4.build.redhat.com 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686 i686 i386 gnulinux ' config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost -Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0' hint=recommended, useposix=true, d_sigaction=define usethreads=define use5005threads=undef useithreads=define usemultiplicity=define useperlio=define d_sfio=undef uselargefiles=define usesocks=undef use64bitint=undef use64bitall=undef uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm', optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4', cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm' ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-2)', gccosandvers='' intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12 ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8 alignbytes=4, prototype=define Linker and Libraries: 
ld='gcc', ldflags =' -L/usr/local/lib' libpth=/usr/local/lib /lib /usr/lib libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc libc=/lib/libc-2.3.4.so, so=so, useshrplib=true, libperl=libperl.so gnulibc_version='2.3.4' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE' cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib' Characteristics of this binary (from libperl): Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS USE_LARGE_FILES PERL_IMPLICIT_CONTEXT Built under linux Compiled at Jul 24 2006 18:28:10 @INC: /usr/lib/perl5/5.8.5/i386-linux-thread-multi /usr/lib/perl5/5.8.5 /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.5 /usr/lib/perl5/site_perl/5.8.4 /usr/lib/perl5/site_perl/5.8.3 /usr/lib/perl5/site_perl/5.8.2 /usr/lib/perl5/site_perl/5.8.1 /usr/lib/perl5/site_perl/5.8.0 /usr/lib/perl5/site_perl /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.5 /usr/lib/perl5/vendor_perl/5.8.4 /usr/lib/perl5/vendor_perl/5.8.3 /usr/lib/perl5/vendor_perl/5.8.2 /usr/lib/perl5/vendor_perl/5.8.1 /usr/lib/perl5/vendor_perl/5.8.0 /usr/lib/perl5/vendor_perl Thanks. George . 
Hilmar Lapp wrote: The perl version appears to be 5.8.5 though, so something strange appears to be going on too. George, can you please post the output of $ /usr/bin/perl -V -hilmar On Jun 18, 2007, at 6:33 PM, Chris Fields wrote: As the error implies your local version of perl doesn't seem support weak references, which means it doesn't have Scalar::Utils (which was added to core after perl 5.6.1, I think). Try installing Scalar::Utils to see what happens. chris On Jun 18, 2007, at 5:18 PM, George Heller wrote: I tried running the below mentioned script and I seem to be getting the following error: Weak references are not implemented in the version of perl at / usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76 BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/ Bio/Tree/Node.pm line 76. Compilation failed in require at my.pl line 7. BEGIN failed--compilation aborted at my.pl line 7. My script looks something like, #!/usr/bin/perl use strict; #use warnings; use DBI; use Bio::Tree::Node; use Bio::DB::Taxonomy; use Bio::DB::Taxonomy::flatfile; my $idx_dir = '/tmp'; my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp'); my $db = new Bio::DB::Taxonomy(-source => 'flatfile', -nodesfile => $nodesfile, -namesfile => $namesfile, -directory => $idx_dir); my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); my @extant_children = grep { $_->is_Leaf } $node- get_all_Descendents; foreach $field (@extant_children) { print "$field"; print "|"; print "\n"; } And I am running the script using the command, perl myscript.pl -v --names names.dmp --nodes nodes.dmp and I have the nodes.dmp and names.dmp files in the current directory. Thanks, George Jason Stajich wrote: It is implemented in the implementing class - DB::Taxonomy is just the base class. For example see the flatfile implementation Bio::DB::Taxonomy::flatfile See the scripts/taxa/local_taxonomydb_query.PLS for example using it: nodes and names are from NCBI taxonomy database. 
Here is an un-debugged copy+paste for your question that *should* work.

use Bio::DB::Taxonomy;

my $idx_dir = '/tmp';
my ($nodesfile, $namesfile) = ('nodes.dmp', 'names.dmp');
my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
                               -nodesfile => $nodesfile,
                               -namesfile => $namesfile,
                               -directory => $idx_dir);
my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

-jason

On Jun 18, 2007, at 10:07 AM, George Heller wrote:

What exactly is the "node n" in the query below? When I issue this query, it says, relation "node" does not exist.

I tried to use the get_all_Descendents method but it looks like in order to do a recursive call it calls the method each_Descendent. This method is not implemented in Bio::DB::Taxonomy. It just has a single line,

shift->throw_not_implemented();

Thanks.
George.

Hilmar Lapp wrote:

I'm a bit confused - it sounds like you have set up a local BioSQL database and loaded the NCBI taxonomy into the database. You can now use simple SQL to retrieve all descendants of a node in the tree given its NCBI taxonID such as

SELECT tn.*, tnm.name
FROM taxon tn, taxon_name tnm, node n
WHERE n.ncbi_taxon_id = :taxonID
AND tn.left_value > n.left_value
AND tn.right_value < n.right_value
AND tn.taxon_id = tnm.taxon_id
AND tn.name_class = 'scientific_name'

BioPerl doesn't have a Taxonomy::biosql module yet (though this would seem like a worthwhile thing to add), so you can't use the Bio::DB::Taxonomy interface to do this against a BioSQL instance.

However, BioPerl does have support for the flat-file download of the NCBI taxonomy database and indexes it, so you can simply use Taxonomy::{get_taxon,get_all_Descendants} using the flatfile download to achieve what you wanted to do in less than 5 lines of perl.

Although the recursive implementation of Taxonomy::get_all_Descendants() won't be lightning fast, it may still be perfectly fine for your application - are you sure it is not?
-hilmar

On Jun 18, 2007, at 12:21 AM, George Heller wrote:

Thanks. And how can I assign the $node here in the below code, such that I can reference it to a particular taxon id record? I want to retrieve all the descendents from the taxonomy hierarchy, given a particular taxon id. I have a local db setup, in which I have uploaded data using the load_ncbi_taxonomy.pl script.

Thanks.
George

Jason Stajich wrote:

I assume you already figured out how to set up a local taxonomydb?

You just want the extant species/leaves of the tree

my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

-jason

On Jun 17, 2007, at 11:41 AM, George Heller wrote:

Hi all,

Can anyone point me to some example that uses the get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at this, and I am not quite sure how to implement it.

Thanks.
George

Sendu Bala wrote:

George Heller wrote:
> Hi all, I am looking at extracting the taxonomy hierarchy for some taxon ids. What I plan to do is, for a given taxon id, say 33090, I want to extract all taxon ids that are children of this species. I do not just want the immediate children, but the children's children and so on. Any ideas on the way I can go about doing this?

Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in some kind of looping structure. Most easily a recursing sub. If you happen to code up something neat and efficient, why not share it with us and we could add it to the Taxonomy module(s).

_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/
From torsten.seemann at infotech.monash.edu.au Mon Jun 18 21:26:41 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Tue, 19 Jun 2007 11:26:41 +1000
Subject: [Bioperl-l] gff2xml
In-Reply-To:
References: <462784640706120904g25a6550dsc56a22af64ca98cd@mail.gmail.com>
Message-ID:

(Sean, please reply to the bioperl-l list rather than to me personally so everyone can read it. I'm reposting it here.)

> > I posted this on the gbrowse list earlier. I'm looking to convert gff
> > data files into xml. Does anyone know of a module written to do this
> > already?
>
> What DTD do you want the XML to conform to?
> eg. ChadoXML, TinySeq XML, TIGR XML ... ?

Hi Torsten,

I'm collaborating with other groups and want web-service compatible functionality for various tools. Normally the analysis tools I'm using generate gff output. I'm going to have to wrap this output in XML with an XSL stylesheet for end-users to view. Haven't done it before and don't know what DTD to use. The bp_seqconvert.pl doesn't accept gff format. I would imagine the DTD would be quite short as the gff files are very standard, I just don't have any experience with these DTD requirements.

--Sean O'Keeffe

From sac at bioperl.org Tue Jun 19 02:42:27 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Mon, 18 Jun 2007 23:42:27 -0700
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy)
Message-ID: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>

On 6/16/07, Jason Stajich wrote:
> [...]
> Just to say I already went through all the steps of running cvs2svn
> myself and had problems gathering back out the branches and all the
> tags when I tried it.
> If you want to start with a smaller repository
> like bioperl-network or bioperl-db as the initial cvs2svn conversion
> script took quite a long time to run on bioperl-live.

Might this be a good opportunity to investigate partitioning bioperl-live into sub-repositories? There has been talk in the past of defining a set of "core" modules separate from other functionally related groups of modules that would be viewed as optional extensions. The goal being to help manage growth and simplify releases. There are currently 892 modules under Bio/.

In addition to simplifying the migration to SVN, it would also have other benefits. Say some new functionality or a slew of fixes were added to Bio::Graphics. We could turn around a new Bio::Graphics release quickly without having to work on getting various other parts up to snuff that aren't related to graphics (Biblio, DB, PopGen, Search etc.). Maintenance and releases of the various extensions would be more parallelizable, orchestrated by separate ring leaders.

Over time, as a set of functionality matures, it would see fewer updates and there would be less of a need for users to download/install/test it. This could make bioperl easier to customize, extend, and grok in general.

Long term, it should ease development and release cycles, but it will involve a bit of near term bullet-biting. We'd need to get clear on how to partition things, including modules, tests, docs, installation logic, etc. and we'd probably need new integration tests to verify that the subsets continue working together.

What do folks think? Would this SVN-based, re-partitioned bioperl-live constitute a 2.0 release? Any volunteers to help assemble a roadmap and milestones? Should I go on dreaming?
Cheers, Steve From bix at sendu.me.uk Tue Jun 19 03:01:05 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 19 Jun 2007 08:01:05 +0100 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: References: <369098.81077.qm@web56507.mail.re3.yahoo.com> Message-ID: <46777F31.7030402@sendu.me.uk> Jason Stajich wrote: > The reason it isn't printing anything is someone didn't really write > the implementation quite right. This code was overhauled by Sendu > before the last release I guess something didn't quite get connected. > > I checked in code that has the Bio::Taxon delegating now to a DB > handle for the each_Descendent call. > You can either patch your code or just use the code listed here: > http://bioperl.org/wiki/Module:Bio::DB::Taxonomy I've reverted that change. For some reason the docs for Bio::Taxon::each_Descendent aren't showing up on the website, but they state: --- Note that this method never asks the database for the descendents; it will only return objects you have manually set with add_Descendent(), or where this was done for you by making a Bio::Tree::Tree with this object as an argument to new(). To get the database descendents use $taxon->db_handle->each_Descendent($taxon). --- I also have a note in the Synopsis for the module: --- # Though be careful with each_Descendent - unless you add_Descendent() # yourself, you won't get an answer because unlike for ancestor(), # Bio::Taxon does not ask the database for the answer. You can ask the # database yourself using the same method: ($human) = $homo->db_handle->each_Descendent($homo); --- This is quite deliberate and is to prevent Bad Things from happening. (Can't exactly remember the reasoning now, but I know it was good.) 
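
A minimal sketch of the "recursing sub" approach suggested earlier in the thread, written against the call shape quoted above ($taxon->db_handle->each_Descendent($taxon)). ToyDB and ToyTaxon are hypothetical stand-ins invented for this illustration so that it runs without BioPerl or the NCBI dump files; they are not BioPerl classes, and the real Bio::Taxon and Bio::DB::Taxonomy objects may behave differently.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-ins for illustration only; these are NOT BioPerl classes.
package ToyDB;

sub new {
    my ($class, %children) = @_;
    return bless { children => { %children } }, $class;
}

# Mirrors the call shape $db->each_Descendent($taxon) from the message above:
# returns the immediate children of the given taxon, built on demand.
sub each_Descendent {
    my ($self, $taxon) = @_;
    my $ids = $self->{children}{ $taxon->id } || [];
    return map { ToyTaxon->new($_, $self) } @$ids;
}

package ToyTaxon;

sub new {
    my ($class, $id, $db) = @_;
    return bless { id => $id, db => $db }, $class;
}
sub id        { $_[0]{id} }
sub db_handle { $_[0]{db} }

package main;

# Recursively collect every descendant by asking the database handle,
# not the taxon object itself, at each level (depth-first).
sub all_descendents {
    my ($taxon) = @_;
    my @found;
    for my $child ($taxon->db_handle->each_Descendent($taxon)) {
        push @found, $child, all_descendents($child);
    }
    return @found;
}

# Toy hierarchy: 1 -> (2, 3), 2 -> (4, 5)
my $db   = ToyDB->new(1 => [2, 3], 2 => [4, 5]);
my $root = ToyTaxon->new(1, $db);

my @all    = all_descendents($root);
# A leaf is a taxon for which the database reports no children.
my @leaves = grep { !$db->each_Descendent($_) } @all;

print join(',', map { $_->id } @all),    "\n";   # prints 2,4,5,3
print join(',', map { $_->id } @leaves), "\n";   # prints 4,5,3
```

With real BioPerl objects the same loop would presumably start from a taxon returned by the database, but that is untested here; note also that a recursion like this visits every descendant, which is exactly the cost the message above is trying to keep out of Bio::Tree::Tree.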
From bix at sendu.me.uk Tue Jun 19 03:41:57 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 19 Jun 2007 08:41:57 +0100
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy)
In-Reply-To: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
Message-ID: <467788C5.6070406@sendu.me.uk>

Steve Chervitz wrote:
> Might this be a good opportunity to investigate partitioning
> bioperl-live into sub-repositories? There has been talk in the past of
> defining a set of "core" modules separate from other functionally
> related groups of modules that would be viewed as optional extensions.
> The goal being to help manage growth and simplify releases. There are
> currently 892 modules under Bio/.
>
> In addition to simplifying the migration to SVN, it would also have
> other benefits. Say some new functionality or a slew of fixes were
> added to Bio::Graphics. We could turn around a new Bio::Graphics
> release quickly without having to work on getting various other parts
> up to snuff that aren't related to graphics (Biblio, DB, PopGen,
> Search etc.). Maintenance and releases of the various extensions would
> be more parallelizable, orchestrated by separate ring leaders.
>
> Over time, as a set of functionality matures, it would see fewer
> updates and there would be less of a need for users to
> download/install/test it. This could make bioperl easier to customize,
> extend, and grok in general.
>
> Long term, it should ease development and release cycles

I actually take the opposite view. Breaking things up makes testing and releases more difficult. If one person acts as pumpkin for all the sub-parts, his work-load increases almost linearly with the number of sub-parts. If each sub-part gets its own pumpkin, where do all these pumpkins come from?
It seems to me that frequently authors will write modules but inevitably their circumstances change and they can no longer devote the time to look after them. Having a single pumpkin and 'forcing' him to make sure everything works (regardless of his personal interest in the module) seems more reliable than hoping there will be a person interested enough in each sub-part to handle its release.

Since all sub-parts will at the least interact with the 'true' core set of Bioperl modules, they need to be tested and potentially re-released every time the true core is updated. And since some sub-parts will interact with other sub-parts, there will need to be coordinated joint-testing and release of multiple sub-parts.

What happens when users report problems? We ask them what version they're running. Right now '1.5.2' means a specific thing, and it's trivial for someone to confirm the same problem by installing 1.5.2. What happens when users have to list out all the versions of all the sub-parts they have? Who is going to consistently recreate a user's hodge-podge of versions in order to confirm a bug? Won't the advice instead be: "update all versions to the latest and get back to us"?

So, as I see it, all sub-parts would best be tested and released with a single new version number every time one sub-part is updated (significantly). In which case, why have sub-parts at all? Keeping things the way they are now means ease of release for the pumpkin and ease of installation for end-users (only one install command to issue to CPAN).

Having 'true' sub-parts (each with its own pumpkin), in my fatalistic view, is just going to lead to some useful sub-parts being abandoned and never updated, even where updates may be desirable.

Each and every Bio:: module could have been released separately by its respective author.
As I see it, one of the main values of 'Bioperl' is that it's one (reasonably) consistent collection of modules that lowers the barrier of entry for new Bioinformaticians, giving them extremely easy access to a whole host of functionality with a single install.

From hlapp at gmx.net Tue Jun 19 08:47:02 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 19 Jun 2007 08:47:02 -0400
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <46777F31.7030402@sendu.me.uk>
References: <369098.81077.qm@web56507.mail.re3.yahoo.com> <46777F31.7030402@sendu.me.uk>
Message-ID: <5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net>

So the real mistake was to write

my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

instead of

my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
my @extant_children = grep { $_->is_Leaf } $db->get_all_Descendents($node);

I.e., the Bio::DB::Taxonomy object *will* (or is allowed to) ask the database?

If this is correct, can we highlight this in the documentation? It's a small difference that everyone failed to spot.

If it is not correct, then maybe we need to revisit the rationale for why a Bio::DB::Taxonomy::get_all_Descendents may not query the underlying database.

Also, in my reading of Bio::Taxonomy::Taxon it won't use the database either for ancestor(). Which would be consistent with its other methods.

I.e., the bottom line is don't use Node or Taxon objects for hierarchy queries that you expect to use an underlying database, use the Bio::DB::Taxonomy object instead. It makes sense, but is it true?

-hilmar

On Jun 19, 2007, at 3:01 AM, Sendu Bala wrote:

> Jason Stajich wrote:
>> The reason it isn't printing anything is someone didn't really write
>> the implementation quite right. This code was overhauled by Sendu
>> before the last release I guess something didn't quite get connected.
>> >> I checked in code that has the Bio::Taxon delegating now to a DB >> handle for the each_Descendent call. >> You can either patch your code or just use the code listed here: >> http://bioperl.org/wiki/Module:Bio::DB::Taxonomy > > I've reverted that change. > > For some reason the docs for Bio::Taxon::each_Descendent aren't > showing > up on the website, but they state: > > --- > Note that this method never asks the database for the descendents; it > will only return objects you have manually set with add_Descendent > (), or > where this was done for you by making a Bio::Tree::Tree with this > object > as an argument to new(). > > To get the database descendents use > $taxon->db_handle->each_Descendent($taxon). > --- > > > I also have a note in the Synopsis for the module: > > --- > # Though be careful with each_Descendent - unless you add_Descendent() > # yourself, you won't get an answer because unlike for ancestor(), > # Bio::Taxon does not ask the database for the answer. You can ask the > # database yourself using the same method: > ($human) = $homo->db_handle->each_Descendent($homo); > --- > > > This is quite deliberate and is to prevent Bad Things from happening. > (Can't exactly remember the reasoning now, but I know it was good.) > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From rvos at interchange.ubc.ca Tue Jun 19 09:05:25 2007 From: rvos at interchange.ubc.ca (rvos) Date: Tue, 19 Jun 2007 06:05:25 -0700 (PDT) Subject: [Bioperl-l] SVN and ...Re: Perltidy Message-ID: <15433211.1182258325544.JavaMail.myubc2@brahms.my.ubc.ca> > Unrelated, but it randomly just occurred to me: what happens to all the > id lines at the top of modules? 
> Eg:
>
> $Id: bl2seq.pm,v 1.28 2007/06/14 14:16:10 sendu Exp $
>
> That's a cvs-specific thing, right? Do we delete them all? (Regardless,
> I wish we would, since they caused me no end of hassles during the 1.5.2
> release, doing updates across branches.)

If you run something like 'svn propset svn:keywords Id' on the file/folder/recursively, svn picks up on the $Id tag. The structure of the resulting string would be a little different, because svn revision numbers are simply auto-increasing integers (afaik) - so any regular expressions that cleverly want to include the revision number in $VERSION would need to be updated.

From bix at sendu.me.uk Tue Jun 19 10:25:26 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 19 Jun 2007 15:25:26 +0100
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net>
References: <369098.81077.qm@web56507.mail.re3.yahoo.com> <46777F31.7030402@sendu.me.uk> <5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net>
Message-ID: <4677E756.6050200@sendu.me.uk>

Hilmar Lapp wrote:
> So the real mistake was to write
>
> my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
> instead of
>
> my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
> my @extant_children = grep { $_->is_Leaf } $db->get_all_Descendents($node);
>
> I.e., the Bio::DB::Taxonomy object *will* (or is allowed to) ask the
> database?

Yes, the database object methods use the database. I don't even think it makes sense to question that. What else would it do?

> If this is correct, can we highlight this in the documentation? It's
> a small difference that everyone failed to spot.

The documentation for what? I've already clearly pointed out the gotcha in Bio::Taxon.

> Also, in my reading of Bio::Taxonomy::Taxon it won't use the database
> either for ancestor(). Which would be consistent with its other methods.

Bio::Taxonomy::Taxon? It's deprecated.
Bio::Taxon is what we're dealing with, and it /does/ use the db to get the ancestor, unless the ancestor is manually set (see below for explanation). > I.e., the bottom line is don't use Node or Taxon objects for > hierarchy queries that you expect to use an underlying database, use > the Bio::DB::Taxonomy object instead. It makes sense, but is it true? Almost. It happens to be true but ideally wouldn't be the case. The confusion and problems arise, I guess, because we have two ways to access/create hierarchies and both of them are built from the same building block (Bio::Taxon objects). On the one hand we have Bio::DB::Taxonomy and the other we have Bio::Tree::Tree. Tree objects are easy: you have a Taxon object created in memory for each and every node in the tree. Each Taxon knows its ancestor and descendants by storing references to the relevant Taxon objects in the tree. You 'navigate' through the tree by grabbing a Taxon inside it and asking the Taxon itself for its ancestor or descendant. This leaves us with the Taxon object having the methods ancestor() and each_Descendent(), which we'll expect to work in other circumstances. Bio::DB::Taxonomy returns single Taxon objects from the database on request. Now we still expect our ancestor() and each_Descendent() methods to work, but if things were set up like Bio::Tree::Tree we'd end up pulling the entire database into memory because we'd have to create all the Taxon objects that are ancestors and descendants, recursively, every time we request a single Taxon (which is wasteful in the case of Bio::DB::Taxonomy::flatfile and slow/not allowed in the case of Bio::DB::Taxonomy::entrez). The solution? We simply don't create the immediate ancestor or descendant Taxon objects of the requested Taxon, and instead implement the Taxon methods to ask the database to create them on demand, if they don't already exist. 
Well, that idea is fine (and necessary) for the ancestor method, but we run into problems with each_Descendent(). The problem arises when we create Bio::Tree::Tree objects from a Taxon we got from the database. Being able to do that is why Bio::Taxon is shared between them, as it is a very desirable thing to do: you can instantly create a lineage tree for a Taxon of interest and then use all the Bio::Tree::Tree methods on it. Unfortunately one of those methods is get_nodes() which is implemented using each_Descendent() and get_all_Descendents(). If each_Descendent() asked the database for the real answer, we'd end up pulling the entire database into the tree. So my implementation was to not ask the database and just warn people in the docs. Ideally it /would/ use the database, because that's what a user would expect. Can anyone see an alternate way around the problem? From hlapp at gmx.net Tue Jun 19 12:14:38 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 19 Jun 2007 12:14:38 -0400 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <4677E756.6050200@sendu.me.uk> References: <369098.81077.qm@web56507.mail.re3.yahoo.com> <46777F31.7030402@sendu.me.uk> <5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net> <4677E756.6050200@sendu.me.uk> Message-ID: Sorry I was accidentally looking at an older branch. Reading through the Taxon module I get more confused though than would leave me at ease. Here's what I understand of your description of the problem: - We would like nodes returned from Bio::DB::Taxonomy to use the database for all hierarchical queries. - We would like nodes used in a Bio::Tree::Tree not to use the database for any hierarchical query. What I understand that we have is - Taxon node objects that have a db_handle set will use the database for ancestor(), unless it has been set manually (?), but not for each_Descendent(). - Taxon node objects that don't have a db_handle set won't use a database but will function normally otherwise. 
- This is needed to prevent Bio::Tree::Tree methods from pulling the entire tree into memory. If this is correct (I'm not sure it is), it sounds like we want to temporarily divorce taxonomy nodes from their database capabilities while they are being queried in a tree context? I'm still trying to understand - if I create a Bio::Tree::Tree from a single node, will the tree automatically contain all nodes along the lineage of ancestors up to the root? So, even if extracting this lineage involved querying a database it would be acceptable, but not for querying descendents? It sounds to me like what is needed is that nodes that get added to a tree need to be stripped of their database capabilities. This could be achieved by creating a wrapper class that delegates all non- hierarchical methods to the wrapped Taxon object, and overriding all hierarchical queries to not use a database. I'm not sure I fully understand yet though, but the inconsistent behavior will be sure to throw people off track. -hilmar On Jun 19, 2007, at 10:25 AM, Sendu Bala wrote: > Hilmar Lapp wrote: >> So the real mistake was to write >> my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); >> my @extant_children = grep { $_->is_Leaf } $node- >> >get_all_Descendents; >> instead of >> my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); >> my @extant_children = grep { $_->is_Leaf } $db- >> >get_all_Descendents ($node); >> I.e., the Bio::DB::Taxonomy object *will* (or is allowed to) ask >> the database? > > Yes, the database object methods use the database. I don't even > think it makes sense to question that. What else would it do? > > >> If this is correct, can we highlight this in the documentation? >> It's a small difference that everyone failed to spot. > > The documentation for what? I've already clearly pointed out the > gotcha in Bio::Taxon. > > >> Also, in my reading of Bio::Taxonomy::Taxon it won't use the >> database either for ancestor(). 
Which would be consistent with >> its other methods. > > Bio::Taxonomy::Taxon? Its deprecated. Bio::Taxon is what we're > dealing with, and it /does/ use the db to get the ancestor, unless > the ancestor is manually set (see below for explanation). > > >> I.e., the bottom line is don't use Node or Taxon objects for >> hierarchy queries that you expect to use an underlying database, >> use the Bio::DB::Taxonomy object instead. It makes sense, but is >> it true? > > Almost. It happens to be true but ideally wouldn't be the case. The > confusion and problems arise, I guess, because we have two ways to > access/create hierarchies and both of them are built from the same > building block (Bio::Taxon objects). > > On the one hand we have Bio::DB::Taxonomy and the other we have > Bio::Tree::Tree. > > Tree objects are easy: you have a Taxon object created in memory > for each and every node in the tree. Each Taxon knows its ancestor > and descendants by storing references to the relevant Taxon objects > in the tree. You 'navigate' through the tree by grabbing a Taxon > inside it and asking the Taxon itself for its ancestor or descendant. > > This leaves us with the Taxon object having the methods ancestor() > and each_Descendent(), which we'll expect to work in other > circumstances. > > Bio::DB::Taxonomy returns single Taxon objects from the database on > request. Now we still expect our ancestor() and each_Descendent() > methods to work, but if things were set up like Bio::Tree::Tree > we'd end up pulling the entire database into memory because we'd > have to create all the Taxon objects that are ancestors and > descendants, recursively, every time we request a single Taxon > (which is wasteful in the case of Bio::DB::Taxonomy::flatfile and > slow/not allowed in the case of Bio::DB::Taxonomy::entrez). > > The solution? 
We simply don't create the immediate ancestor or > descendant Taxon objects of the requested Taxon, and instead > implement the Taxon methods to ask the database to create them on > demand, if they don't already exist. Well, that idea is fine (and > necessary) for the ancestor method, but we run into problems with > each_Descendent(). > > The problem arises when we create Bio::Tree::Tree objects from a > Taxon we got from the database. Being able to do that is why > Bio::Taxon is shared between them, as it is a very desirable thing > to do: you can instantly create a lineage tree for a Taxon of > interest and then use all the Bio::Tree::Tree methods on it. > Unfortunately one of those methods is get_nodes() which is > implemented using each_Descendent() and get_all_Descendents(). If > each_Descendent() asked the database for the real answer, we'd end > up pulling the entire database into the tree. > > So my implementation was to not ask the database and just warn > people in the docs. Ideally it /would/ use the database, because > that's what a user would expect. Can anyone see an alternate way > around the problem? -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cain.cshl at gmail.com Tue Jun 19 14:41:52 2007 From: cain.cshl at gmail.com (Scott Cain) Date: Tue, 19 Jun 2007 14:41:52 -0400 Subject: [Bioperl-l] [Gmod-gbrowse] is this a bp_genbank2gff3.pl bug? In-Reply-To: <18039.61086.829726.809888@gargle.gargle.HOWL> References: <18039.61086.829726.809888@gargle.gargle.HOWL> Message-ID: <1182278512.2592.42.camel@localhost.localdomain> Hi Alessandra, I cc'ed your message to the bioperl and sequence ontology mailing lists, since your question is relevant to both. 
Converting genbank files to GFF3 is excruciatingly difficult; I generally find that I can use the genbank2gff3 script to get me most of the way there, but then I need to do some manual fixing to make it 'right'. I am using bioperl-live, since there have been several fixes to the script since bioperl 1.5.2 was released, including the most recent fixes from me today (when I started working on this); I would suggest you use bioperl-live as well. I ran the script on chrY. Most (perhaps all) of the errors fit into a few categories: - CDS doesn't have a phase, where the GFF3 spec requires CDSes to have a phase. Since it can be a little bit of a hassle to calculate, I understand why it was left out, but I'll submit a bug report to have those calculated. If you are planning on loading the GFF file into Chado, you can use the --noCDS option to get exons instead of CDSes, which makes the problem go away (the validator has a bug here though--it reports the polypeptide derives_from mRNA as invalid, but it is correct; I'm reporting that directly to the author). Here's the bioperl bug report: http://bugzilla.open-bio.org/show_bug.cgi?id=2322 - "invalid type pair" is caused by the genbank file using feature types in a way that conflicts with the Sequence Ontology. For example, it has STS features that are part_of a gene, pseudogenic_region as part_of pseudogene. I don't know if there would be an easy way to catch this in the conversion script. You may need to fix these by hand. If the problems occur for features that you don't care about, you can use the --filter option to leave them out of the resulting GFF file (for example, adding '--filter STS' would leave all STS features out of the file). Also, if you don't plan on loading these into Chado (which does require SO-compliance) but instead plan on using a Bio::DB::SeqFeature database, these errors may not be a problem. 
- "invalid type" is caused by feature types that are not in SOFA (Sequence Ontology for Feature Annotation), though the terms probably are in SO. I thought at one point we discussed allowing any SO type to appear in the GFF3 type column, but that is not what the spec says now. I don't see this type of error as causing a problem for either Bio::DB::SeqFeature or Chado. Chado allows features to be typed with anything that is in SO and does not restrict to SOFA.

Scott

On Tue, 2007-06-19 at 16:56 +0200, Alessandra Bilardi wrote:
> Hi all,
>
> I used bp_genbank2gff3.pl from CVS bioperl to create GFF3 from the
> human genbank files. I ran validate_gff3 online on human.gff and it
> reports non-unique IDs, so loading the file into the gbrowse database
> gives errors.
>
> I attach the error files for hs_ref_chrY.gbk and hs_ref_chr1.gbk, which
> I downloaded from ftp://ftp.ncbi.nih.gov/genomes/H_sapiens
> Elements with non-unique IDs are:
> - CDS or pseudo*exon without mRNA and parent
> - STS with equal start and end
> - tRNA with equal name
>
> If this is a bp_genbank2gff3.pl bug, can you rectify bp_genbank2gff3.pl?
> If I'm mistaken, can you help me?
>
> Thanks very much for the help in advance,
>
> Alessandra.

--
------------------------------------------------------------------------
Scott Cain, Ph. D. cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/) 216-392-3087
Cold Spring Harbor Laboratory
-------------- next part --------------
A non-text attachment was scrubbed...
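On the missing CDS phase discussed in Scott's reply above: a rough sketch of how the phase could be derived from the ordered CDS segments. This is not code from bp_genbank2gff3.pl; the helper name cds_phases and its calling convention are made up for illustration, and it assumes translation starts at the first base of the first segment (GenBank /codon_start=1).

```perl
use strict;
use warnings;

# Hypothetical helper, not taken from bp_genbank2gff3.pl.
# $strand   : +1 or -1
# @segments : [start, end] pairs of one CDS, 1-based inclusive coordinates
# Assumes /codon_start=1; other values would need an initial offset.
sub cds_phases {
    my ($strand, @segments) = @_;

    # Walk the segments in the direction of translation.
    my @ordered = sort { $a->[0] <=> $b->[0] } @segments;
    @ordered = reverse @ordered if $strand < 0;

    my $consumed = 0;    # coding bases seen in earlier segments
    my @with_phase;
    for my $segment (@ordered) {
        my ($start, $end) = @$segment;
        # Phase = number of bases to skip at the 5' end of this segment
        # to reach the first base of the next complete codon.
        my $phase = (3 - $consumed % 3) % 3;
        push @with_phase, [$start, $end, $phase];
        $consumed += $end - $start + 1;
    }
    return @with_phase;
}

# Print start, end and phase for a three-segment CDS on the plus strand.
for my $row (cds_phases(+1, [1, 10], [21, 30], [41, 50])) {
    print join("\t", 'CDS', @$row), "\n";
}
```

Partial CDSes and a codon_start other than 1 are exactly the cases that make the real calculation "a little bit of a hassle", and they are not handled here.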
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part URL: From sac at bioperl.org Tue Jun 19 14:54:39 2007 From: sac at bioperl.org (Steve Chervitz) Date: Tue, 19 Jun 2007 11:54:39 -0700 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <467788C5.6070406@sendu.me.uk> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> Message-ID: <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> Valid points, Sendu. I wonder if there might be a best-of-both-worlds approach here. I would not be advocating for a major slice and dice, but just identifying a few large, reasonably well established and encapsulated blocks of functionality that could be managed more independently and segregating them away from the rest. For example: DB, Graphics, Search+SearchIO, Tools. Once per year, we could have a "whole caboodle" release where the core and all sub parts are tested and released as a group, as we currently do. Then, updates to the sub parts can occur as-needed but without necessarily involving updates to other sub parts or the core. The onus would be on the pumpkin for the sub part release to make sure it continues to work with the last whole caboodle release. This would minimize the number of release clashes, since sub part updates would only be sanctioned relative to the last caboodle release, and it would ensure that the whole set continues to interoperate. Perhaps it would be worth experimenting with such an approach so we can judge it based on actual experience. We could identify one functional sub part and segregate it out, do a release cycle or two, along with a sub part release, and decide if this makes things easier or harder, for devs as well as users. We could always bring it back into the fold if it doesn't work out. 
My fear is that as bioperl continues to grow, the monolithic approach will become increasingly onerous for a single release pumpkin to manage, and harder to find someone who feels up to the task. It could also discourage new developers from diving into the codebase if it looks too deep. And they are our lifeblood. A more functionally segregated bioperl codebase could lower the activation energy needed to recruit release pumpkins and new devs, leading to more release iterations, fewer bugs, more features, and more sustainable growth. When I first discovered Bioperl in 1996, it had three modules. At ~900, I probably wouldn't have joined ranks as a developer (well, I probably would, but it would have taken a while to digest it and become a contributor). Steve On 6/19/07, Sendu Bala wrote: > Steve Chervitz wrote: > > Might this been a good opportunity to investigate partitioning > > bioperl-live into sub-repositories? There has been talk in the past of > > defining a set of "core" modules separate from other functionally > > related groups of modules that would be viewed as optional extensions. > > The goal being to help manage growth and simplify releases. There are > > currently 892 modules under Bio/. > > > > In addition to simplifying the migration to SVN, it would also have > > other benefits. Say some new functionality or a slew of fixes were > > added to Bio::Graphics. We could turn around a new Bio::Graphics > > release quickly without having to work on getting various other parts > > up to snuff that aren't related to graphics (Biblio, DB, PopGen, > > Search etc.). Maintenance and releases of the various extensions would > > be more parallelizable, orchestrated by separate ring leaders. > > > > Over time, as a set of functionality matures, it would see fewer > > updates and there would be less of a need for users to > > download/install/test it. This could make bioperl easier to customize, > > extend, and grok in general. 
> > > > Long term, it should ease development and release cycles > > I actually take the opposite view. Breaking things up makes testing and > releases more difficult. > > If one person acts as pumpkin for all the sub-parts, his work-load > increases almost linearly with the number of sub-parts. If each sub-part > gets its own pumpkin, where do all these pumpkins come from? It seems to > me that frequently authors will write modules but inevitably their > circumstance changes and they can no longer devote the time to look > after them. Having a single pumpkin and 'forcing' him to make sure > everything works (regardless of his personal interest in the module) > seems more reliable than hoping there will be a person interested enough > in each sub-part to handle its release. > > Since all sub-parts will at the least interact with the 'true' core set > of Bioperl modules, they need to be tested and potentially re-released > every time the true core is updated. And since some sub-parts will > interact with other sub-parts, there will need to be coordinated > joint-testing and release of multiple sub-parts. > > What happens when users report problems? We ask them what version > they're running. Right now '1.5.2' means a specific thing, and its > trivial for someone to confirm the same problem by installing 1.5.2. > What happens when users have to list out all the versions of all the > sub-parts they have? Who is going to consistently recreate a users > hodge-podge of versions in order to confirm a bug? Won't the advice > instead be: "update all versions to the latest and get back to us"? > > So, as I see it, all sub-parts would best be tested and released with a > single new version number every time one sub-part is updated > (significantly). In which case, why have sub-parts at all? Keeping > things the way they are now means ease of release for the pumpkin and > ease of installation for end-users (only one install command to issue to > CPAN). 
Having 'true' sub-parts (each with its own pumpkin), in my
> fatalistic view, is just going to lead to some useful sub-parts being
> abandoned and never updated, even where updates may be desirable.
>
> Each and every Bio:: module could have been released separately by its
> respective author. As I see it, one of the main values of 'Bioperl' is
> that it's one (reasonably) consistent collection of modules that lowers
> the barrier of entry for new Bioinformaticians, giving them extremely
> easy access to a whole host of functionality with a single install.
>

From bix at sendu.me.uk Tue Jun 19 15:13:39 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 19 Jun 2007 20:13:39 +0100
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy)
In-Reply-To: <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
Message-ID: <46782AE3.2090703@sendu.me.uk>

Steve Chervitz wrote:
> Valid points, Sendu. I wonder if there might be a best-of-both-worlds
> approach here.
[snip]

You haven't convinced me, but I'd go along with the majority decision if best-of-both-worlds was picked.

> DB, Graphics, Search+SearchIO, Tools.

I will, however, say that DB interleaves into too many core modules. It should stay in core.

Tools? It's hardly touched anyway, so I don't see the value of taking it out, what with Bio::Tools::Run already being its own package. Most Bioperl users probably get Bioperl just to do something Blast related, so all Blast stuff really ought to stay in core.

Graphics is an obvious choice and I agree. It is updated frequently and has its own release needs. It also has some of the trickier dependencies, so splitting it out would make installing core simpler.

I can imagine plucking Search+SearchIO out, and it's something that needs regular updating. Another good candidate.
> Perhaps it would be worth experimenting with such an approach so we > can judge it based on actual experience. We could identify one > functional sub part and segregate it out, do a release cycle or two, > along with a sub part release, and decide if this makes things easier > or harder, for devs as well as users. Well, we already have the run package. Its a split-off subpart that gets updated. The only 'experiment' left to do is finding it its own pumpkin. From bix at sendu.me.uk Tue Jun 19 15:48:50 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 19 Jun 2007 20:48:50 +0100 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: References: <369098.81077.qm@web56507.mail.re3.yahoo.com> <46777F31.7030402@sendu.me.uk> <5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net> <4677E756.6050200@sendu.me.uk> Message-ID: <46783322.30309@sendu.me.uk> Hilmar Lapp wrote: > Here's what I understand of your description of the problem: > > - We would like nodes returned from Bio::DB::Taxonomy to use the > database for all hierarchical queries. > > - We would like nodes used in a Bio::Tree::Tree not to use the > database for any hierarchical query. Correct. > What I understand that we have is > > - Taxon node objects that have a db_handle set will use the database > for ancestor(), unless it has been set manually (?), but not for > each_Descendent(). > > - Taxon node objects that don't have a db_handle set won't use a > database but will function normally otherwise. > > - This is needed to prevent Bio::Tree::Tree methods from pulling the > entire tree into memory. Correct. > If this is correct (I'm not sure it is), it sounds like we want to > temporarily divorce taxonomy nodes from their database capabilities > while they are being queried in a tree context? Yes. > I'm still trying to understand - if I create a Bio::Tree::Tree from a > single node, will the tree automatically contain all nodes along the > lineage of ancestors up to the root? 
So, even if extracting this > lineage involved querying a database it would be acceptable, but not > for querying descendents? Yes. Asking the database for all the ancestors up to root only pulls a couple of nodes into the tree and is exactly what the user would want to happen. But if nodes are allowed to get their descendants from the database, when we get the root node from the database, we'd get all the root's descendants, and then for each of those we'd get all /their/ descendants... that's when the whole db gets sucked in. > It sounds to me like what is needed is that nodes that get added to a > tree need to be stripped of their database capabilities. This could > be achieved by creating a wrapper class that delegates all non- > hierarchical methods to the wrapped Taxon object, and overriding all > hierarchical queries to not use a database. I'm not sure I fully > understand yet though, but the inconsistent behavior will be sure to > throw people off track. When we're making a tree from a db Taxon we need db access to find all the ancestors; we just don't want to get any descendants outside our initiating Taxon's direct lineage. 
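A toy sketch of the wrapper idea quoted above, assuming the tree holds wrapper nodes that delegate non-hierarchical methods to the database-backed node and answer hierarchical queries only from links that were explicitly added. None of this is BioPerl code: ToyTaxDB, ToyTaxon, DetachedTaxon, lineage_root and count_nodes are made-up names, and the sketch says nothing about how Bio::Taxon or Bio::Tree::Tree would actually have to change.

```perl
use strict;
use warnings;

# Toy stand-in for a taxonomy database: maps child name => parent name.
package ToyTaxDB;
sub new {
    my ($class, %parent_of) = @_;
    return bless { parent_of => {%parent_of} }, $class;
}
sub parent_name { my ($self, $name) = @_; return $self->{parent_of}{$name} }
sub child_names {
    my ($self, $name) = @_;
    my $parent_of = $self->{parent_of};
    return sort { $a cmp $b } grep { $parent_of->{$_} eq $name } keys %$parent_of;
}

# Database-backed node: every hierarchical query goes to the database.
package ToyTaxon;
sub new {
    my ($class, %args) = @_;
    return bless { name => $args{name}, db => $args{db} }, $class;
}
sub name { return $_[0]{name} }
sub ancestor {
    my ($self) = @_;
    my $parent = $self->{db}->parent_name($self->{name});
    return defined $parent ? ToyTaxon->new(name => $parent, db => $self->{db}) : undef;
}
sub each_descendent {
    my ($self) = @_;
    return map { ToyTaxon->new(name => $_, db => $self->{db}) }
           $self->{db}->child_names($self->{name});
}

# Wrapper used inside a tree: delegates non-hierarchical methods to the
# wrapped node and answers hierarchical queries from local links only.
package DetachedTaxon;
use Scalar::Util qw(weaken);
sub new {
    my ($class, $taxon) = @_;
    return bless { taxon => $taxon, parent => undef, children => [] }, $class;
}
sub name            { return $_[0]{taxon}->name }
sub ancestor        { return $_[0]{parent} }
sub each_descendent { return @{ $_[0]{children} } }
sub add_descendent {
    my ($self, $child) = @_;
    push @{ $self->{children} }, $child;
    $child->{parent} = $self;
    weaken $child->{parent};    # avoid a parent <-> child reference cycle
}

package main;

# Build root -> ... -> $taxon, asking the database for ancestors only.
sub lineage_root {
    my ($taxon) = @_;
    my $node = DetachedTaxon->new($taxon);
    while (my $ancestor = $taxon->ancestor) {
        my $parent = DetachedTaxon->new($ancestor);
        $parent->add_descendent($node);
        ($taxon, $node) = ($ancestor, $parent);
    }
    return $node;
}

# Count nodes reachable through each_descendent (the get_nodes analogue).
sub count_nodes {
    my ($node) = @_;
    my $count = 1;
    $count += count_nodes($_) for $node->each_descendent;
    return $count;
}

my $db = ToyTaxDB->new(
    'Mammalia' => 'Eukaryota', 'Primates'     => 'Mammalia',
    'Homo'     => 'Primates',  'Homo sapiens' => 'Homo',
    'Rodentia' => 'Mammalia',  'Mus'          => 'Rodentia',
    'Mus musculus' => 'Mus',
);
my $homo = ToyTaxon->new(name => 'Homo', db => $db);
my $root = lineage_root($homo);
my @kids = $homo->each_descendent;
print scalar(@kids), " database descendant(s) of ", $homo->name, "\n";
print count_nodes($root), " nodes in the lineage tree rooted at ", $root->name, "\n";
```

In this toy the database-backed node still reports its real descendants, while walking the lineage tree never touches the Rodentia branch. Whether the same split is practical for the real classes is a separate question.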
    use Test::More tests => 5;
    use Bio::DB::Taxonomy;
    use Bio::Tree::Tree;

    my @names = ('Eukaryota', 'Mammalia', 'Primates', 'Homo', 'Homo sapiens');
    my @ranks = qw(superkingdom class order genus species);
    my $db = Bio::DB::Taxonomy->new(-source => 'list',
                                    -names  => \@names,
                                    -ranks  => \@ranks);

    @names = ('Eukaryota', 'Mammalia', 'Rodentia', 'Mus', 'Mus musculus');
    $db->add_lineage(-names => \@names, -ranks => \@ranks);

    my $homo = $db->get_taxon(-name => 'Homo');
    isa_ok($homo, 'Bio::Taxon');                               # PASS
    is $homo->ancestor->scientific_name, 'Primates';           # PASS
    my @descs = $homo->each_Descendent;
    is scalar(@descs), 1;   # FAIL, we wanted it to contain the 'Homo sapiens' node

    my $lineage = Bio::Tree::Tree->new(-node => $homo);
    is $lineage->get_root_node->scientific_name, 'Eukaryota';  # PASS
    my @nodes = $lineage->get_nodes;
    is scalar(@nodes), 4;   # PASS: we didn't pull in Rodentia which would be 8

(on that last test I can't remember if the answer might actually be 5 because our lineage does contain 'Homo sapiens')

If anyone can figure out how to get all those to pass, please let me know.

From cjfields at uiuc.edu Tue Jun 19 17:15:00 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 19 Jun 2007 16:15:00 -0500
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy)
In-Reply-To: <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
Message-ID: <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>

On Jun 19, 2007, at 1:54 PM, Steve Chervitz wrote:
> Valid points, Sendu. I wonder if there might be a best-of-both-worlds
> approach here. I would not be advocating for a major slice and dice,
> but just identifying a few large, reasonably well established and
> encapsulated blocks of functionality that could be managed more
> independently and segregating them away from the rest. For example:
> DB, Graphics, Search+SearchIO, Tools.
There should also be a consensus between the core devs on this; I don't see it going very far w/o Lincoln, Jason, Hilmar, etc. voicing their opinions as it will directly impact projects which rely on core functionality (GBrowse/GMOD, bioperl-db, etc). I also agree with George that this should be postponed until after svn issues are taken care of. Stating that, I think this is a good idea in general, though we'll need to be careful which ones we segregate out as non-core. I agree with your choices; I would add in Bio::Restriction, Bio::Assembly, Bio::Structure, and a few more. As long as the distribution required installation of 'core' prior to test runs it shouldn't be too much of a problem. In order for this to work we would need to delineate what defines 'core' (how broad the definition should be), then identify those modules that don't fit and decide what to do with them. Would we want to split the others into separate packages or lump together as a bioperl-auxiliary (horrid name, but you get my point)? Too many could be a logistical nightmare, as Sendu has pointed out. > Once per year, we could have a "whole caboodle" release where the core > and all sub parts are tested and released as a group, as we currently > do. Then, updates to the sub parts can occur as-needed but without > necessarily involving updates to other sub parts or the core. Sounds fine by me. Actually, my thought was we could reimplement Bundle::BioPerl on CPAN (which Module::Build effectively obsoleted) to install all the necessary subpackages in order to emulate an old- style 'core' installation, or act as an 'install everything BioPerl- related' Bundle. Regular updates of the subpackages to CPAN should just require updating the Bundle (which would update only the relevant parts, at least I believe it would). > The onus would be on the pumpkin for the sub part release to make sure > it continues to work with the last whole caboodle release. 
This would > minimize the number of release clashes, since sub part updates would > only be sanctioned relative to the last caboodle release, and it would > ensure that the whole set continues to interoperate. > > Perhaps it would be worth experimenting with such an approach so we > can judge it based on actual experience. We could identify one > functional sub part and segregate it out, do a release cycle or two, > along with a sub part release, and decide if this makes things easier > or harder, for devs as well as users. We could always bring it back > into the fold if it doesn't work out. > > My fear is that as bioperl continues to grow, the monolithic approach > will become increasingly onerous for a single release pumpkin to > manage, and harder to find someone who feels up to the task. It could > also discourage new developers from diving into the codebase if it > looks too deep. And they are our lifeblood. Agreed! > A more functionally segregated bioperl codebase could lower the > activation energy needed to recruit release pumpkins and new devs, > leading to more release iterations, fewer bugs, more features, and > more sustainable growth. 'Activation energy.' Hmm. Spoken like a true biologist. > When I first discovered Bioperl in 1996, it had three modules. At > ~900, I probably wouldn't have joined ranks as a developer (well, I > probably would, but it would have taken a while to digest it and > become a contributor). > > Steve I pretty much agree, though this will require quite a bit more discussion. 
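For readers unfamiliar with CPAN bundles, the Bundle::BioPerl idea discussed above could look roughly like the sketch below. A bundle is an ordinary module whose CONTENTS pod section lists one module per line, optionally followed by a version and a comment; installing the bundle installs whatever distributions provide those modules. The version number and the choice of modules here are placeholders picked for illustration, not the contents of any released Bundle::BioPerl.

```perl
package Bundle::BioPerl;

use strict;
use warnings;

our $VERSION = '0.01';    # placeholder version for this sketch

1;

=head1 NAME

Bundle::BioPerl - illustrative bundle pulling in a split-up BioPerl

=head1 SYNOPSIS

  perl -MCPAN -e 'install Bundle::BioPerl'

=head1 CONTENTS

Bio::Root::Root - stands in for the core distribution

Bio::Graphics - stands in for a split-off graphics distribution

Bio::SearchIO - stands in for a split-off Search/SearchIO distribution

Bio::Tools::Run::Alignment::Clustalw - stands in for bioperl-run

=cut
```

Under this arrangement a sub-package release would mean a new upload of that sub-package plus, at most, a bumped version line in the bundle.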
chris From hlapp at gmx.net Tue Jun 19 17:57:54 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 19 Jun 2007 17:57:54 -0400 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> Message-ID: <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> On Jun 19, 2007, at 5:15 PM, Chris Fields wrote: > There should also be a consensus between the core devs on this; I > don't see it going very far w/o Lincoln, Jason, Hilmar, etc. voicing > their opinions The problem I have increasingly had with BioPerl (aside from the fact that it's written in Perl ;) is the plethora of dependencies I need to install, not the number of modules. But every time I've been told that that's what Perl is all about, and I should shut up and install the bundle. Idiosyncratically I don't like bundles that clutter up my hard disk with stuff I'll never use, and in this sense if BioPerl is divided into 10 packages I will have to think about each one whether I need it, and do a separate CVS checkout - and regular update - of each one (though granted, I believe there are ways the multiple checkout and update thing can be taken care of). In reality, this may be a rapidly disappearing trait though of those who have grown up in a time when they proudly spent all their savings to buy that new computer because it had a 20MB hard disk, compared to the two 360k floppy drives the previous one had. So don't ask me, just don't make it too hard for the dinosaurs. > as it will directly impact projects which rely on core > functionality (GBrowse/GMOD, bioperl-db, etc). Well, I hope there are ways to limit that? > I also agree with George that this should be postponed until after > svn issues are taken care of. I agree entirely. 
Please don't throw this in the same bin or tie one to the other. The migration is neither easier nor faster nor better testable with a partitioned BioPerl. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Tue Jun 19 21:48:20 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 19 Jun 2007 20:48:20 -0500 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> Message-ID: On Jun 19, 2007, at 4:57 PM, Hilmar Lapp wrote: > On Jun 19, 2007, at 5:15 PM, Chris Fields wrote: > >> There should also be a consensus between the core devs on this; I >> don't see it going very far w/o Lincoln, Jason, Hilmar, etc. voicing >> their opinions > > The problem I have increasingly had with BioPerl (aside from the fact > that it's written in Perl ;) is the plethora of dependencies I need > to install, not the number of modules. > > But every time I've been told that that's what Perl is all about, and > I should shut up and install the bundle. Idiosyncratically I don't > like bundles that clutter up my hard disk with stuff I'll never use, > and in this sense if BioPerl is divided into 10 packages I will have > to think about each one whether I need it, and do a separate CVS > checkout - and regular update - of each one (though granted, I > believe there are ways the multiple checkout and update thing can be > taken care of). I agree; the fewer dependencies the better. 
We could divide it up into a small, focused core package with only a few dependencies, and 1-3 more containing the focused bits which require the most maintenance (Graphics, SearchIO/Tools, etc). I worry about having too many more. > In reality, this may be a rapidly disappearing trait though of those > who have grown up in a time when they proudly spent all their savings > to buy that new computer because it had a 20MB hard disk, compared to > the two 360k floppy drives the previous one had. > > So don't ask me, just don't make it too hard for the dinosaurs. There would need to be some way of getting an old-style full-blown core installation regardless of how many subdistros we would divy core up into. My thought for CPAN was having Bundle::BioPerl take over this but I'm not sure if it's still being used. Maybe there are other ways for svn/cvs. >> as it will directly impact projects which rely on core >> functionality (GBrowse/GMOD, bioperl-db, etc). > > Well, I hope there are ways to limit that? I believe so, yes, particularly for bioperl-db. I would think splitting off Bio::Graphics or Bio::DB* will have some effect on GBrowse/GFF. >> I also agree with George that this should be postponed until after >> svn issues are taken care of. > > I agree entirely. Please don't throw this in the same bin or tie one > to the other. The migration is neither easier nor faster nor better > testable with a partitioned BioPerl. > > -hilmar We def. have to complete transition to subversion first, then think about this some more. chris From n.haigh at sheffield.ac.uk Wed Jun 20 02:31:24 2007 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Wed, 20 Jun 2007 07:31:24 +0100 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> Message-ID: <4678C9BC.10206@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Chris Fields wrote: > On Jun 19, 2007, at 4:57 PM, Hilmar Lapp wrote: > >> On Jun 19, 2007, at 5:15 PM, Chris Fields wrote: >> >>> There should also be a consensus between the core devs on this; I >>> don't see it going very far w/o Lincoln, Jason, Hilmar, etc. voicing >>> their opinions >> The problem I have increasingly had with BioPerl (aside from the fact >> that it's written in Perl ;) is the plethora of dependencies I need >> to install, not the number of modules. >> >> But every time I've been told that that's what Perl is all about, and >> I should shut up and install the bundle. Idiosyncratically I don't >> like bundles that clutter up my hard disk with stuff I'll never use, >> and in this sense if BioPerl is divided into 10 packages I will have >> to think about each one whether I need it, and do a separate CVS >> checkout - and regular update - of each one (though granted, I >> believe there are ways the multiple checkout and update thing can be >> taken care of). > > I agree; the fewer dependencies the better. We could divide it up > into a small, focused core package with only a few dependencies, and > 1-3 more containing the focused bits which require the most > maintenance (Graphics, SearchIO/Tools, etc). I worry about having > too many more. 
>
>> In reality, this may be a rapidly disappearing trait though of those
>> who have grown up in a time when they proudly spent all their savings
>> to buy that new computer because it had a 20MB hard disk, compared to
>> the two 360k floppy drives the previous one had.
>>
>> So don't ask me, just don't make it too hard for the dinosaurs.
>
> There would need to be some way of getting an old-style full-blown
> core installation regardless of how many subdistros we would divvy
> core up into. My thought for CPAN was having Bundle::BioPerl take
> over this but I'm not sure if it's still being used. Maybe there are
> other ways for svn/cvs.

Personally, I think this use of Bundle::BioPerl is more in line with what CPAN Bundles were meant to do - "a bundle is a collection of modules that comprise a cohesive unit". Under that definition you could probably put the whole of Bioperl but I won't go there! When a package is updated and a new release is made, it should be installable/updatable via cpan, and the bundle should be updated with the correct version. This way you can get all of Bioperl via the bundle, or just install the sub-packages on their own.

If the switch over to svn takes place, will all the Bioperl-* projects move over at the same time? If so, will they go into their own svn repository or into the same one? Since with svn you can check out any subtree of the repository I'm not clear on the pros and cons of either of these options.

Am I right in thinking that there is a way for cvs to define a "project" such that when you check out that "project" it actually checks out multiple projects behind the scenes? I'm sure I've seen this somewhere, possibly when the project is dependent on some 3rd party code that is also in cvs. If this is possible, I'm sure it will also be possible with svn. This could then allow something like the following to happen after the split up of Bioperl. The following projects could be defined: bioperl-core, bioperl-graphics etc.
Issuing a checkout of a "project" called "bioperl" would actually check out the real projects called bioperl-core, bioperl-graphics etc. I may just be dreaming, but it seems that this ought to be possible, doesn't it?

>
>>> as it will directly impact projects which rely on core
>>> functionality (GBrowse/GMOD, bioperl-db, etc).
>> Well, I hope there are ways to limit that?
>
> I believe so, yes, particularly for bioperl-db. I would think
> splitting off Bio::Graphics or Bio::DB* will have some effect on
> GBrowse/GFF.
>
>>> I also agree with George that this should be postponed until after
>>> svn issues are taken care of.
>> I agree entirely. Please don't throw this in the same bin or tie one
>> to the other. The migration is neither easier nor faster nor better
>> testable with a partitioned BioPerl.
>>
>> -hilmar
>
> We def. have to complete transition to subversion first, then think
> about this some more.
>
> chris

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGeMm7czuW2jkwy2gRAi+CAJ9cNZ70GojV7eviRjdWTFLk/MKYoACg2Ls4
op9sQTZyeK6G6taFhTAPMYc=
=7NRw
-----END PGP SIGNATURE-----

From hlapp at gmx.net Wed Jun 20 07:46:16 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 20 Jun 2007 07:46:16 -0400
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy)
In-Reply-To: <4678C9BC.10206@sheffield.ac.uk>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> <4678C9BC.10206@sheffield.ac.uk>
Message-ID: <5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Jun
20, 2007, at 2:31 AM, Nathan S. Haigh wrote: > If the switch over to svn takes place, will all the Bioperl-* projects > move over at the same time? They are under the same CVSROOT right now. Locking down some sub- repositories but not others may be odd or impossible. > If so, will they go into their own svn repository or into the same > one? Good question, I'm not sure about the pros and cons one way or the other either. The fewer repositories the less sysadmin work in fine- graining permissions. -hilmar - -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (Darwin) iD8DBQFGeRONuV6N2JxL7qsRAoYTAJ9GVuC0j4szCcWTg7yWGoxN3YFucQCgogJ8 Ims4d150lsX0vXtDwGI1lKg= =K4++ -----END PGP SIGNATURE----- From n.haigh at sheffield.ac.uk Wed Jun 20 07:57:22 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Wed, 20 Jun 2007 12:57:22 +0100 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> <4678C9BC.10206@sheffield.ac.uk> <5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net> Message-ID: <46791622.6080409@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hilmar Lapp wrote: > > On Jun 20, 2007, at 2:31 AM, Nathan S. Haigh wrote: > >> If the switch over to svn takes place, will all the Bioperl-* projects >> move over at the same time? > > They are under the same CVSROOT right now. Locking down some > sub-repositories but not others may be odd or impossible. > >> If so, will they go into their own svn repository or into the same one? 
>
> Good question, I'm not sure about the pros and cons one way or the other
> either. The fewer repositories the less sysadmin work in fine-graining
> permissions.
>
> -hilmar
>

I don't think there is any major reason why the following single repos wouldn't do the trick:

/--
 |-bioperl-live
 |    |--- trunk
 |    |--- branches
 |    |--- tags
 |
 |-bioperl-run
      |--- trunk
      |--- branches
      |--- tags

Any reason why this couldn't be used? I know some people don't like the idea of the revision number incrementing for the whole repository if it contains several "projects". However, revision numbers are really only a way for svn to keep track of things and a very large revision number shouldn't really "upset" anyone.

Nath

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGeRYiczuW2jkwy2gRApS5AJsHl73MWZP8aMfOqlLgTYuzpMWmQgCg3VqA
1Vj8BSUnanpdjYYLE6eGanU=
=bOqK
-----END PGP SIGNATURE-----

From hlapp at gmx.net Wed Jun 20 08:08:33 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 20 Jun 2007 08:08:33 -0400
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy)
In-Reply-To: <46791622.6080409@sheffield.ac.uk>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> <4678C9BC.10206@sheffield.ac.uk> <5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net> <46791622.6080409@sheffield.ac.uk>
Message-ID:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Jun 20, 2007, at 7:57 AM, Nathan S. Haigh wrote:

> I don't think there is any major reason why the following single repos
> wouldn't do the trick:
>
> /--
>  |-bioperl-live
>  |    |--- trunk
>  |    |--- branches
>  |    |--- tags
>  |
>  |-bioperl-run
>       |--- trunk
>       |--- branches
>       |--- tags
>
> Any reason why this couldn't be used?
That would work fine except that there are several more sub-projects (bioperl-db, bioperl-graphics, bioperl-microarray, and a few more). That should still be fine. I think what needs to be recognized is the limitations it puts on permission granularity. If it's all the same repository (as is now) then having commit rights to one (subproject) will mean commit rights to all. From my perspective that's fine, it has worked great so far. -hilmar - -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (Darwin) iD8DBQFGeRjFuV6N2JxL7qsRAj3dAJ42r1C8By29DNTUP9Ts0Lf5dOcS9QCgjSE1 hckjT7LBtHcmwGI8B+BKQIM= =gYfA -----END PGP SIGNATURE----- From hartzell at alerce.com Tue Jun 19 15:53:39 2007 From: hartzell at alerce.com (George Hartzell) Date: Tue, 19 Jun 2007 12:53:39 -0700 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> Message-ID: <18040.13379.217277.992742@almost.alerce.com> Steve Chervitz writes: > On 6/16/07, Jason Stajich wrote: > > [...] > > Just to say I already went through all the steps of running cvs2svn > > myself and had problems gathering back out the branches and all the > > tags when I tried it. If you want to start with a smaller repository > > like bioperl-network or bioperl-db as the initial cvs2svn conversion > > script took quite a long time to run on bioperl-live. > > Might this been a good opportunity to investigate partitioning > bioperl-live into sub-repositories? [...] I'd say that the time to do this kind of rearrangement would be *after* the svn repo's set up. That way you'll be able to track stuff back through to the beginning of time. g. 
From sdavis2 at mail.nih.gov Wed Jun 20 08:44:08 2007 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Wed, 20 Jun 2007 08:44:08 -0400 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> <4678C9BC.10206@sheffield.ac.uk> <5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net> <46791622.6080409@sheffield.ac.uk> Message-ID: <46792118.4030205@mail.nih.gov> Hilmar Lapp wrote: > > On Jun 20, 2007, at 7:57 AM, Nathan S. Haigh wrote: > >> I don't think there is any major reason why the following single repos >> wouldn't do the trick: > >> /-- >> |-bioperl-live >> | |--- trunk >> | |--- branches >> | |--- tags >> | >> |-bioperl-run >> |--- trunk >> |--- branches >> |--- tags > >> Any reason why this couldn't be used? > > That would work fine except that there are several more sub-projects > (bioperl-db, bioperl-graphics, bioperl-microarray, and a few more). > > That should still be fine. I think what needs to be recognized is the > limitations it puts on permission granularity. If it's all the same > repository (as is now) then having commit rights to one (subproject) > will mean commit rights to all. From my perspective that's fine, it > has worked great so far. Actually, I think there are ways of creating per-directory access control. See here: http://svnbook.red-bean.com/en/1.2/svn-book.html#svn.serverconfig.svnserve.auth.general With Apache-based https access, such access control is relatively straightforward, it appears. With the standalone svn server over ssh, one needs to use "commit hook scripts" to limit access. But I think it is possible (admitting that I have not tried to do this...). 
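A rough sketch of the hook-script route, in Perl. It only shows the decision logic, and it assumes a hypothetical table mapping committers to the top-level project directories they may change; the user names and the commit_allowed helper are invented for illustration. A real pre-commit hook would obtain the author and the changed paths for the pending transaction from `svnlook author` and `svnlook changed`, which is not shown here.

```perl
use strict;
use warnings;

# Hypothetical permission table: committer => project directories
# (top-level entries of the repository) they may modify.
# '*' stands for every project.
my %acl = (
    alice => ['*'],
    bob   => ['bioperl-run', 'bioperl-ext'],
);

# Return 1 if $user may commit every path in @$paths, 0 otherwise.
# Paths are repository-relative, e.g. "bioperl-run/trunk/Build.PL".
sub commit_allowed {
    my ($user, $paths, $acl) = @_;
    my $allowed = $acl->{$user} or return 0;    # unknown user: deny
    return 1 if grep { $_ eq '*' } @$allowed;
    my %ok = map { $_ => 1 } @$allowed;
    for my $path (@$paths) {
        my ($project) = split m{/}, $path;      # first path component
        return 0 unless defined $project && $ok{$project};
    }
    return 1;
}

# A real hook would exit non-zero here to reject the commit.
my @changed = ('bioperl-run/trunk/Build.PL', 'bioperl-live/trunk/Bio/Seq.pm');
print commit_allowed('bob', \@changed, \%acl) ? "allow\n" : "deny\n";
```

Keying the table on the top-level directory matches the one-repository, one-directory-per-project layout proposed in this thread; finer-grained rules would need a longer prefix match.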
Sean

From hartzell at alerce.com Wed Jun 20 09:23:32 2007
From: hartzell at alerce.com (George Hartzell)
Date: Wed, 20 Jun 2007 06:23:32 -0700
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy)
In-Reply-To: <4678C9BC.10206@sheffield.ac.uk>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> <4678C9BC.10206@sheffield.ac.uk>
Message-ID: <18041.10836.728079.835572@almost.alerce.com>

Nathan S. Haigh writes:
 > [...]
 > If the switch over to svn takes place, will all the Bioperl-* projects
 > move over at the same time? If so, will they go into their own svn
 > repository or into the same one? Since with svn you can checkout any
 > subtree of the repository I'm not clear on the pro's and cons of either
 > of these options.

I'm planning to drop the projects from the top of the CVSROOT into a
single svn repository:

  bioperl-ext           bioperl-pipeline    biodata             bioperl-gui
  bioperl-run           bioperl-cookbook    bioperl-live        biosql-schema
  bioperl-corba-client  bioperl-microarray  html                bioperl-corba-server
  bioperl-network       task-manager        bioperl-das-client  bioperl-papers
  xml-html              bioperl-db          bioperl-pedigree

although that's open to feedback from the core members.

As a progress report, I've built a demo repos with -run, -ext, and
-live in it and asked a couple of folks to take a peek at it. When
I get a bit further along I'll figure out how to get something for the
public to test.

 > Am I right in thinking that there is a way for cvs to define a "project"
 > such that when you checkout that "project" it actually checks out
 > multiple projects behind the scene? I'm sure I've seen this somewhere,
 > possibly when the project is dependent on some 3rd party code that is
 > also in cvs. If this is possible, I'm sure it will also be possible with
 > svn.
This could then allow something like the following to happen after > the split up of Bioperl. The following projects could be defined: > bioperl-core, bioperl-graphics etc. Issuing a checkout of a "project" > called "bioperl" would actually checkout the real projects call > bioperl-core, bioperl-graphics etc. I may just be dreaming, but it seems > that this ought to be possible, doesn't it? > [...] I don't think that there's any functionality like that in svn. g. From hartzell at alerce.com Wed Jun 20 09:26:04 2007 From: hartzell at alerce.com (George Hartzell) Date: Wed, 20 Jun 2007 06:26:04 -0700 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <46791622.6080409@sheffield.ac.uk> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> <4678C9BC.10206@sheffield.ac.uk> <5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net> <46791622.6080409@sheffield.ac.uk> Message-ID: <18041.10988.375946.833182@almost.alerce.com> Nathan S. Haigh writes: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hilmar Lapp wrote: > > > > On Jun 20, 2007, at 2:31 AM, Nathan S. Haigh wrote: > > > >> If the switch over to svn takes place, will all the Bioperl-* projects > >> move over at the same time? > > > > They are under the same CVSROOT right now. Locking down some > > sub-repositories but not others may be odd or impossible. > > > >> If so, will they go into their own svn repository or into the same one? > > > > Good question, I'm not sure about the pros and cons one way or the other > > either. The fewer repositories the less sysadmin work in fine-graining > > permissions. 
> > > > -hilmar > > > > > I don't think there is any major reason why the following single repos > wouldn't do the trick: > > /-- > |-bioperl-live > | |--- trunk > | |--- branches > | |--- tags > | > |-bioperl-run > |--- trunk > |--- branches > |--- tags > > Any reason why this couldn't be used? > [...] That's exactly the way that I'm setting it up. g. From n.haigh at sheffield.ac.uk Wed Jun 20 09:33:33 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Wed, 20 Jun 2007 14:33:33 +0100 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <18041.10836.728079.835572@almost.alerce.com> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> <4678C9BC.10206@sheffield.ac.uk> <18041.10836.728079.835572@almost.alerce.com> Message-ID: <46792CAD.5060700@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 George Hartzell wrote: > Nathan S. Haigh writes: > > [...] > > If the switch over to svn takes place, will all the Bioperl-* projects > > move over at the same time? If so, will they go into their own svn > > repository or into the same one? Since with svn you can checkout any > > subtree of the repository I'm not clear on the pro's and cons of either > > of these options. > > I'm planning to drop the projects from the top of the CVSROOT into a > single svn repository: > > bioperl-ext bioperl-pipeline biodata bioperl-gui > bioperl-run bioperl-cookbook bioperl-live biosql-schema > bioperl-corba-client bioperl-microarray html bioperl-corba-server > bioperl-network task-manager bioperl-das-client bioperl-papers > xml-html bioperl-db bioperl-pedigree > > although that's open to feedback from the core members. 
> > As a progress report, I've built a demo repos with -run, -ext, and > -live in it and asked a couple of folks to to take a peek at it. When > I get a bit further along I'll figure out how to get something for the > public to test. Could I take a peek?? > > > Am I right in thinking that there is a way for cvs to define a "project" > > such that when you checkout that "project" it actually checks out > > multiple projects behind the scene? I'm sure I've seen this somewhere, > > possibly when the project is dependent on some 3rd party code that is > > also in cvs. If this is possible, I'm sure it will also be possible with > > svn. This could then allow something like the following to happen after > > the split up of Bioperl. The following projects could be defined: > > bioperl-core, bioperl-graphics etc. Issuing a checkout of a "project" > > called "bioperl" would actually checkout the real projects call > > bioperl-core, bioperl-graphics etc. I may just be dreaming, but it seems > > that this ought to be possible, doesn't it? > > [...] > > I don't think that there's any functionality like that in svn. 
I did come across this which might help:
http://subversion.tigris.org/servlets/ReadMsg?listName=users&msgNo=43561

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGeSytczuW2jkwy2gRAnlUAJ4pjhPlYlqOm+M882Ni116MJVzPCwCbB3Su
sWDAmqFhGgtlyeawaIGSV14=
=zeAY
-----END PGP SIGNATURE-----

From bix at sendu.me.uk Wed Jun 20 11:38:20 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 20 Jun 2007 16:38:20 +0100
Subject: [Bioperl-l] New testing base: BioperlTest.pm
Message-ID: <467949EC.9040100@sendu.me.uk>

In considering updating all the test scripts to take advantage of the
new network option, and/or reimplementing them in Test::More, I thought
now would be a good time to standardize all the test scripts and reduce
the possibility of having to alter them all in the future if something
changes.

For example we could decide on an alternate way of choosing to run
network tests, or a new way of deciding to output debug information.
There are also some inconsistencies in the messages produced by tests
skipping all, and even an unfortunate mistake that has been copy/pasted
through a lot of test scripts.

My solution is t/lib/BioperlTest.pm (documented with perldoc)

We go from this:

----
use strict;
our $DEBUG;

BEGIN {
    $DEBUG = $ENV{'BIOPERLDEBUG'} || 0;

    eval { require Test::More; };
    if( $@ ) {
        use lib 't/lib';
    }
    use Test::More; # the mistake!

    use Module::Build;
    my $build = Module::Build->current();
    my $do_network_tests = $build->notes('network');

    eval {
        require IO::String;
        require LWP;
        require LWP::UserAgent;
    };
    if ($@) {
        plan skip_all => 'IO::String or LWP or LWP::UserAgentnot installed. This means Bio::Tools::Run::RemoteBlast is not usable. Skipping tests';
    }
    elsif (!$do_network_tests) {
        plan skip_all => 'Network tests have not been requested, skipping all';
    }
    else {
        plan tests => 21;
    }

    #...
}

my $obj = Bio::Object->new(-verbose => $DEBUG);
#...
----

To this:

----
use strict;

BEGIN {
    use lib 't/lib';
    use BioperlTest;

    test_begin(-requires_modules => [qw(IO::String LWP LWP::UserAgent)],
               -requires_networking => 1,
               -tests => 21);

    #...
}

my $obj = Bio::Object->new(-verbose => test_debug());
#...
----

Can anyone identify problems with this approach? Is the interface
presented by BioperlTest flexible enough that any changes would only be
additions for new functionality (and therefore all test scripts wouldn't
need to be altered)? Is BioperlTest missing anything you'd like?

Are there any objections to me updating all tests in this manner? For an
example, see t/RemoteBlast.t

Cheers,
Sendu.

From spiros at lokku.com Wed Jun 20 11:49:48 2007
From: spiros at lokku.com (Spiros Denaxas)
Date: Wed, 20 Jun 2007 16:49:48 +0100
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <4D4061A2-E937-4F97-85C3-8937B597C64F@uiuc.edu>
References: <467661F0.2060703@sendu.me.uk> <4676A01F.30205@sendu.me.uk> <082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu> <4676B41E.3050706@sendu.me.uk> <4D4061A2-E937-4F97-85C3-8937B597C64F@uiuc.edu>
Message-ID:

Yep, they are not all done. Some still need to be ported over; I'm doing
some here and there at home. However, the recent email Sendu sent, the
one about abstracting the setup of testing, is actually something I was
thinking of myself, so it might be a better way to tackle the problem.
For one thing it would save us from duplicating the same 30 lines of
code across all tests.

As far as network tests are concerned, I've always been an avid hater of
them. I believe they bring more trouble than they are worth due to the
diversity of setups people have. My way of tackling them was always to
group all the tests that required live access into one file and then
forcibly just run that - iff needed and not by default.

Like I said, that's just my opinion; I've been bitten by them one time
too many.
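A minimal sketch of that opt-in idea using plain Test::More. The environment variable BIOPERL_NETWORK_TESTS and the helper network_tests_requested are made up for illustration; they are not part of BioPerl or of BioperlTest.pm, and the single test is only a placeholder.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical switch: network tests run only when this environment
# variable is set to a true value, so they never run by default.
sub network_tests_requested {
    my ($env) = @_;
    return $env->{BIOPERL_NETWORK_TESTS} ? 1 : 0;
}

SKIP: {
    skip 'network tests not requested', 1
        unless network_tests_requested(\%ENV);

    # Everything that needs live access would be collected in this file.
    ok(1, 'placeholder for a test that talks to a remote server');
}

done_testing();
```

Run without the variable, the file reports its test as skipped rather than failed, so an ordinary test run stays green on machines without network access.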
Spiros On 6/18/07, Chris Fields wrote: > > On Jun 18, 2007, at 11:34 AM, Sendu Bala wrote: > > > Chris Fields wrote: > >> Couldn't you enable BIOPERLDEBUG, disable network access, then > >> iterate through tests checking for those which fail or skip? > > > > Yes, good idea, though my dev machine is also my email/webserver so > > I'd rather come up with an alternate solution than one involving > > 'disable network access'. > > > > Still, that's what I'll probably end up doing. Cheers! > > > > > > Oh, Chris, Spiros, how goes the Test::More conversion? I might want > > to wait for you to finish, or join in? If you're not going to have > > time to do any more in the next few weeks, can you please update > > http://www.bioperl.org/wiki/TestMoreProgress removing your name (or > > in the opposite case, add your name in)? Its not quite clear to me > > which tests are assigned to whom. Can someone clarify what the > > markings mean? > > > > Cheers, > > Sendu. > > Not sure how far along spiros is; I handed it over after I finished > up to the 'Q' tests. In general the ones marked out have been > converted over, ones with names next to them have been claimed. If > you need help I'll prob. start back up again to finish them off; we > just need to divy them up. > > chris > From hlapp at gmx.net Wed Jun 20 12:27:47 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 20 Jun 2007 12:27:47 -0400 Subject: [Bioperl-l] New testing base: BioperlTest.pm In-Reply-To: <467949EC.9040100@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> Message-ID: Very cool! Sounds like a no-brainer to me to adopt this in all the tests. -hilmar On Jun 20, 2007, at 11:38 AM, Sendu Bala wrote: > In considering updating all the test scripts to take advantage of the > new network option, and/or reimplementing them in Test::More, I > thought > now would be a good time to standardize all the test scripts and > reduce > the possibility of having to alter them all in the future if something > changes. 
> [...]
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Wed Jun 20 12:44:01 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 20 Jun 2007 11:44:01 -0500 Subject: [Bioperl-l] New testing base: BioperlTest.pm In-Reply-To: References: <467949EC.9040100@sendu.me.uk> Message-ID: Agreed! You've already created an example case so there's something to go off of. I plan on changing some EUtilities tests soon so I'll try implementing this, basing off your RemoteBlast.t implementation. Seems clear enough on the surface; if I run into problems I'll post. chris On Jun 20, 2007, at 11:27 AM, Hilmar Lapp wrote: > Very cool! Sounds like a no-brainer to me to adopt this in all the > tests. -hilmar > > On Jun 20, 2007, at 11:38 AM, Sendu Bala wrote: > >> In considering updating all the test scripts to take advantage of the >> new network option, and/or reimplementing them in Test::More, I >> thought >> now would be a good time to standardize all the test scripts and >> reduce >> the possibility of having to alter them all in the future if >> something >> changes. >> >> For example we could decide on an alternate way of choosing to run >> network tests, or a new way of deciding to output debug information. >> There are also some inconsistencies in the messages produced by tests >> skipping all, and even an unfortunate mistake that has been copy/ >> pasted >> through a lot of test scripts. 
>> [...]
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> --
> ===========================================================
> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
> ===========================================================
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From wollenbergk at mail.nih.gov Wed Jun 20 14:11:04 2007
From: wollenbergk at mail.nih.gov (Wollenberg, Kurt (NIH/NIAID))
Date: Wed, 20 Jun 2007 14:11:04 -0400
Subject: [Bioperl-l] get_sequence() gets some sequences but not others
Message-ID:

Greetings:

I am working on a script to take a list of sequence IDs, extract the
sequences from GenPept, and then run a BLAST search for each of the
retrieved sequences. I am having a problem with the sequence retrieval,
where some sequences are found and others are not and it's not obvious
to me why this is.

For example, using a text file containing the two following IDs as input:
SKG3_YEAST
NEM1_YEAST

My script

while( ) {
    chomp;
    my $seqid = $_;
    my $seq_obj = get_sequence( 'genpept', $seqid );
}

will create a sequence object for the first ID, (print "Accession of
",$seqid," is ",$seq_obj->accession, "\n"; gives me the correct accession
number) but for the second I am told

-------------------- WARNING ---------------------
MSG: id (NEM1_YEAST) does not exist
---------------------------------------------------

When I pull up these records using the Entrez cross-database search in my
web browser I find genpept records for both SKG3_YEAST and NEM1_YEAST
(using these search terms).
In both records these IDs reside in the same field ("DBSOURCE swissprot: locus") so I'm mystified why get_sequence finds one but not the other. Any advice would be greatly appreciated. Cheers, Kurt Wollenberg, Ph.D. Phylogenetics and Sequence Analysis Consultant Biocomputing Research Consulting Section Bioinformatics and Scientific IT Program (BSIP) NIH/NIAID/OTIS Contractor, Lockheed Martin http://bioinformatics.niaid.nih.gov Disclaimer: The information in this e-mail and any of its attachments is confidential and may contain sensitive information. It should not be used by anyone who is not the original intended recipient. If you have received this e-mail in error please inform the sender and delete it from your mailbox or any other storage devices. National Institute of Allergy and Infectious Diseases shall not accept liability for any statements made that are sender's own and not expressly made on behalf of the NIAID by one of its representatives. From bosborne11 at verizon.net Wed Jun 20 14:59:39 2007 From: bosborne11 at verizon.net (Brian Osborne) Date: Wed, 20 Jun 2007 14:59:39 -0400 Subject: [Bioperl-l] get_sequence() gets some sequences but not others In-Reply-To: Message-ID: Kurt, I can't answer your question but I wouldn't use Bio::Perl myself, I'd use Bio::DB::GenPept: 501 ~>perl -e 'use Bio::DB::GenPept; $db = Bio::DB::GenPept->new; $seq = $db->get_Seq_by_acc('NEM1_YEAST'); print $seq->seq;' MNALKYFSNHLITTKKQKKINVEVTKNQDLLGPSKEVSNKYTSHSENDCVSEVDQQYDHSSSHLKESDQNQERKNS VPKKPKALRSILIEKIASILWALLLFLPYYLIIKPLMSLWFVFTFPLSVIERRVKHTDKRNRGSNASENELPVSSS NINDSSEKTNPKNCNLNTIPEAVEDDLNASDEIILQRDNVKGSLLRAQSVKSRPRSYSKSELSLSNHSSSNTVFGT KRMGRFLFPKKLIPKSVLNTQKKKKLVIDLDETLIHSASRSTTHSNSSQGHLVEVKFGLSGIRTLYFIHKRPYCDL FLTKVSKWYDLIIFTASMKEYADPVIDWLESSFPSSFSKRYYRSDCVLRDGVGYIKDLSIVKDSEENGKGSSSSLD DVIIIDNSPVSYAMNVDNAIQVEGWISDPTDTDLLNLLPFLEAMRYSTDVRNILALKHGEKAFNIN502 ~> It's true that Bio::Perl is easy-to-use but it's also _very_ limited. Brian O. 
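For a whole list of IDs, a small wrapper can trap per-ID failures so one missing record does not stop the run. This is only a sketch: fetch_each is a hypothetical helper, and a stand-in lookup table replaces the database so the example runs without BioPerl or network access. With BioPerl installed, the code reference could wrap the Bio::DB::GenPept call shown above, as noted in the comment.

```perl
use strict;
use warnings;

# Call $fetch for each ID, trapping failures so that one bad ID does
# not abort the loop. Handles a lookup that dies as well as one that
# returns undef. Returns (\%found, \@missing).
sub fetch_each {
    my ($ids, $fetch) = @_;
    my (%found, @missing);
    for my $id (@$ids) {
        my $seq = eval { $fetch->($id) };
        if (defined $seq) { $found{$id} = $seq }
        else              { push @missing, $id }
    }
    return (\%found, \@missing);
}

# With BioPerl available, the fetcher might look like this:
#   require Bio::DB::GenPept;
#   my $db    = Bio::DB::GenPept->new;
#   my $fetch = sub { $db->get_Seq_by_acc($_[0]) };
# Stand-in data so the sketch runs offline (not real sequences):
my %fake = (SKG3_YEAST => 'placeholder-sequence');
my ($found, $missing) = fetch_each(
    [qw(SKG3_YEAST NEM1_YEAST)],
    sub { $fake{ $_[0] } // die "id ($_[0]) does not exist\n" },
);
print "found: @{[ sort keys %$found ]}\n";
print "missing: @$missing\n";
```

Collecting the misses in a list makes it easy to retry them later with a different accession or database.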
On 6/20/07 2:11 PM, "Wollenberg, Kurt (NIH/NIAID)" wrote:

> [...]

From cjfields at uiuc.edu Wed Jun 20 16:11:34 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 20 Jun 2007 15:11:34 -0500
Subject: [Bioperl-l] get_sequence() gets some sequences but not others
In-Reply-To:
References:
Message-ID:

I'm assuming you are using the Bio::Perl exported sub get_sequence().
I am able to reproduce the issue using bioperl-live; it's an odd issue
as direct use of Bio::DB::GenPept works fine:

use Bio::DB::GenPept;

my $factory = Bio::DB::GenPept->new();
my @accs = qw(SKG3_YEAST NEM1_YEAST);
my $io = $factory->get_Stream_by_acc(\@accs);
while (my $seq = $io->next_seq) {
    print "Accession:",$seq->accession,"\n";
}

chris

On Jun 20, 2007, at 1:11 PM, Wollenberg, Kurt (NIH/NIAID) wrote:

> [...]

Christopher Fields
Postdoctoral Researcher
Lab of Dr.
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From sac at bioperl.org Thu Jun 21 02:32:47 2007 From: sac at bioperl.org (Steve Chervitz) Date: Wed, 20 Jun 2007 23:32:47 -0700 Subject: [Bioperl-l] New testing base: BioperlTest.pm In-Reply-To: References: <467949EC.9040100@sendu.me.uk> Message-ID: <8f200b4c0706202332w25a09547k1de20f24466877d9@mail.gmail.com> Looks like a nice refactor. After it's in place, don't forget to update the wiki: http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests Steve On 6/20/07, Chris Fields wrote: > Agreed! You've already created an example case so there's something > to go off of. > > I plan on changing some EUtilities tests soon so I'll try > implementing this, basing off your RemoteBlast.t implementation. > Seems clear enough on the surface; if I run into problems I'll post. > > chris > > On Jun 20, 2007, at 11:27 AM, Hilmar Lapp wrote: > > > Very cool! Sounds like a no-brainer to me to adopt this in all the > > tests. -hilmar > > > > On Jun 20, 2007, at 11:38 AM, Sendu Bala wrote: > > > >> In considering updating all the test scripts to take advantage of the > >> new network option, and/or reimplementing them in Test::More, I > >> thought > >> now would be a good time to standardize all the test scripts and > >> reduce > >> the possibility of having to alter them all in the future if > >> something > >> changes. > >> > >> For example we could decide on an alternate way of choosing to run > >> network tests, or a new way of deciding to output debug information. > >> There are also some inconsistencies in the messages produced by tests > >> skipping all, and even an unfortunate mistake that has been copy/ > >> pasted > >> through a lot of test scripts. 
> >> > >> My solution is t/lib/BioperlTest.pm (documented with perldoc) > >> > >> We go from this: > >> > >> ---- > >> use strict; > >> our $DEBUG; > >> > >> BEGIN { > >> $DEBUG = $ENV{'BIOPERLDEBUG'} || 0; > >> > >> eval { require Test::More; }; > >> if( $@ ) { > >> use lib 't/lib'; > >> } > >> use Test::More; # the mistake! > >> > >> use Module::Build; > >> my $build = Module::Build->current(); > >> my $do_network_tests = $build->notes('network'); > >> > >> eval { > >> require IO::String; > >> require LWP; > >> require LWP::UserAgent; > >> }; > >> if ($@) { > >> plan skip_all => 'IO::String or LWP or LWP::UserAgentnot > >> installed. > >> This means Bio::Tools::Run::RemoteBlast is not usable. Skipping > >> tests'; > >> } > >> elsif (!$do_network_tests) { > >> plan skip_all => 'Network tests have not been requested, > >> skipping > >> all'; > >> } > >> else { > >> plan tests => 21; > >> } > >> > >> #... > >> } > >> > >> my $obj = Bio::Object->new(-verbose => $DEBUG); > >> #... > >> ---- > >> > >> To this: > >> > >> ---- > >> use strict; > >> > >> BEGIN { > >> use lib 't/lib'; > >> use BioperlTest; > >> > >> test_begin(-requires_modules => [qw(IO::String LWP > >> LWP::UserAgent)], > >> -requires_networking => 1, > >> -tests => 21); > >> > >> #... > >> } > >> > >> my $obj = Bio::Object->new(-verbose => test_debug()); > >> #... > >> ---- > >> > >> > >> Can anyone identify problems with this approach? Is the interface > >> presented by BioperlTest flexible enough that any changes would > >> only be > >> additions for new functionality (and therefore all test scripts > >> wouldn't > >> need to be altered)? Is BioperlTest missing anything you'd like? > >> > >> Are there any objections to me updating all tests in this manner? > >> For an > >> example, see t/RemoteBlast.t > >> > >> > >> Cheers, > >> Sendu. 
> >> _______________________________________________ > >> Bioperl-l mailing list > >> Bioperl-l at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > -- > > =========================================================== > > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > > =========================================================== > > > > > > > > > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From staffa at niehs.nih.gov Thu Jun 21 14:36:12 2007 From: staffa at niehs.nih.gov (Staffa, Nick (NIH/NIEHS)) Date: Thu, 21 Jun 2007 14:36:12 -0400 Subject: [Bioperl-l] BIO::DB::FASTA ID Message-ID: This program below returns only 1527 IDs from a fasta file that I have constructed, which has mildred> grep -c "^>Dpse" T_orthologs_Dpse_genes.fa 1820 . It actually does not return the first 3 ids, nor the 5th, nor 7..36, 38,39,41..44...... The header lines are of variable length and the sequence lines are 80 characters except at the ends when they might be shorter. Is there some caveat that I am ignoring in my format that breaks bio::db::fasta? 
#!/usr/bin/perl
#
#
#
use strict;
use Bio::DB::Fasta;
use Bio::Tools::SeqWords;
use Bio::Seq;
use Bio::SeqIO;
$|=1;
#
#
my $Dpse_UTR_file_for_T_orthologs =
"/home/staffa/clients/Kari/D_pse_genome/testit/T_orthologs_Dpse_genes.fa";
my $db = Bio::DB::Fasta->new
('/home/staffa/clients/Kari/D_pse_genome/testit/T_orthologs_Dpse_genes.fa',
-reindex, -makeid => \&make_my_id);
my @ids = $db->ids;
my $number_in = @ids;
print "number of Dpse IDs = $number_in\n";
foreach my $id (@ids){
    print "$id\n";
}
sub make_my_id {
    # parse header line:
    # >Dpse_GA13134 CG14636 NO UTR has 2 TATTTAT 117 145, 0 TTATTTATT
    my $line = shift;
    # print "line = $line\n";
    $line =~ />(\w+) /;
    my $ID = $1;
    # print "ID = $ID\n";
    return $ID;
}
-------------- next part --------------
A non-text attachment was scrubbed...
Name: T_orthologs_Dpse_genes.fa
Type: application/octet-stream
Size: 5033676 bytes
Desc: not available
URL:

From jason at bioperl.org Thu Jun 21 17:19:14 2007
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 21 Jun 2007 14:19:14 -0700
Subject: [Bioperl-l] BIO::DB::FASTA ID
In-Reply-To:
References:
Message-ID:

Hey Nick -
I think
a) your IDs are not unique
b) you need to declare the function make_my_id BEFORE your call
Bio::DB::Fasta->new if you want your function to be used.
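Point a) can be checked from Perl as well as from the shell. What follows is only a minimal sketch: the header lines are made up (the attached FASTA file is not reproduced in the archive), and the regex is the one from make_my_id in the posted script. It counts how often each extracted ID occurs and lists any ID that repeats.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-ins for header lines; only the first one is taken from
# the comment in the posted script, the other two are invented.
my @headers = (
    '>Dpse_GA13134 CG14636 NO UTR has 2 TATTTAT 117 145, 0 TTATTTATT',
    '>Dpse_GA13134 CG14636 invented second record with the same first word',
    '>Dpse_GA10001 CG00001 invented header',
);

# Same extraction rule as make_my_id in the posted script.
sub make_my_id {
    my $line = shift;
    return $line =~ />(\w+) / ? $1 : undef;
}

# Count how many headers map to each ID.
my %count;
for my $header (@headers) {
    my $id = make_my_id($header);
    next unless defined $id;
    $count{$id}++;
}

# IDs that occur more than once would collide in an index keyed by ID.
my @dups = sort grep { $count{$_} > 1 } keys %count;

printf "%d headers, %d unique IDs\n", scalar @headers, scalar keys %count;
print "duplicated: $_ ($count{$_} times)\n" for @dups;
```

If the number of unique IDs is smaller than the number of headers, the ID rule needs to use more of the header line than its first word.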
$ grep "^>" T_orthologs_Dpse_genes.fa | awk '{print $1}' | sort | uniq | wc -l 1527 -jason On Jun 21, 2007, at 11:36 AM, Staffa, Nick (NIH/NIEHS) wrote: > #!/usr/bin/perl > # > # > # > use strict; > use Bio::DB::Fasta; > use Bio::Tools::SeqWords; > use Bio::Seq; > use Bio::SeqIO; > $|=1; > # > # > my $Dpse_UTR_file_for_T_orthologs = > "/home/staffa/clients/Kari/D_pse_genome/testit/ > T_orthologs_Dpse_genes.fa"; > my $db = Bio::DB::Fasta->new > ('/home/staffa/clients/Kari/D_pse_genome/testit/ > T_orthologs_Dpse_genes.fa', > -reindex, -makeid => \&make_my_id); > my @ids = $db->ids; > my $number_in = @ids; > print "number of Dpse IDs = $number_in\n"; > foreach my $id (@ids){ > print "$id\n"; > } > sub make_my_id { > # parse header line: > # >Dpse_GA13134 CG14636 NO UTR has 2 TATTTAT 117 145, 0 > TTATTTATT > my $line = shift; > # print "line = $line\n"; > $line =~ />(\w+) /; > my $ID = $1; > # print "ID = $ID\n"; > return $ID; > } -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From mkiwala at watson.wustl.edu Thu Jun 21 17:23:46 2007 From: mkiwala at watson.wustl.edu (Michael Kiwala) Date: Thu, 21 Jun 2007 16:23:46 -0500 Subject: [Bioperl-l] BIO::DB::FASTA ID In-Reply-To: References: Message-ID: <467AEC62.2040508@watson.wustl.edu> You only have 1527 unique id's in the file. ~$ grep '^>' Desktop/T_orthologs_Dpse_genes.fa|cut -d\ -f1|sort -u|wc -l 1527 Change your make_id function to make sure the id's are unique. Staffa, Nick (NIH/NIEHS) wrote: > This program below returns only 1527 IDs from a fasta file that I have > constructed, which has > mildred> grep -c "^>Dpse" T_orthologs_Dpse_genes.fa > 1820 > . > It actually does not return the first 3 ids, > nor the 5th, nor 7..36, 38,39,41..44...... > The header lines are of variable length and the sequence lines are 80 > characters except at the ends when they might be shorter. > Is there some caveat that I am ignoring in my format that breaks > bio::db::fasta? 
> > > #!/usr/bin/perl > # > # > # > use strict; > use Bio::DB::Fasta; > use Bio::Tools::SeqWords; > use Bio::Seq; > use Bio::SeqIO; > $|=1; > # > # > my $Dpse_UTR_file_for_T_orthologs = > "/home/staffa/clients/Kari/D_pse_genome/testit/T_orthologs_Dpse_genes.fa"; > my $db = Bio::DB::Fasta->new > ('/home/staffa/clients/Kari/D_pse_genome/testit/T_orthologs_Dpse_genes.fa', > -reindex, -makeid => \&make_my_id); > my @ids = $db->ids; > my $number_in = @ids; > print "number of Dpse IDs = $number_in\n"; > foreach my $id (@ids){ > print "$id\n"; > } > sub make_my_id { > # parse header line: > # >Dpse_GA13134 CG14636 NO UTR has 2 TATTTAT 117 145, 0 TTATTTATT > my $line = shift; > # print "line = $line\n"; > $line =~ />(\w+) /; > my $ID = $1; > # print "ID = $ID\n"; > return $ID; > } > > ------------------------------------------------------------------------ > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From bix at sendu.me.uk Mon Jun 25 09:06:27 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 25 Jun 2007 14:06:27 +0100 Subject: [Bioperl-l] New testing base: BioperlTest.pm In-Reply-To: <467949EC.9040100@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> Message-ID: <467FBDD3.8050009@sendu.me.uk> Sendu Bala wrote: > In considering updating all the test scripts to [... use] t/lib/BioperlTest.pm I'm now in the process of converting all test scripts. In addition to those things mentioned previously, BioperlTest now also provides the methods test_input_file() and test_output_file(). This: ---- use Bio::Root::IO; my $output_file = Bio::Root::IO->catfile(qw(t data temp.file)); $obj->new(-file => ">$output_file"); END { unlink($output_file); } ... $obj->new(-file => Bio::Root::IO->catfile(qw(t data input.file))); ---- Becomes this: ---- my $output_file = test_output_file(); $obj->new(-file => ">$output_file"); ... 
$obj->new(-file => test_input_file('input.file')); ---- I should think the benefits are obvious, especially for the output files, which thanks to inconsistency of using END blocks correctly or at all, leaves some output data behind on occasion. test_input_file() is helpful for the shorthand, but also gets rid of many tests' usage of Bio::Root::IO (relying on something you're installing and testing in another test script to work in the current test script, without testing it in your own test script seems like a no-no to me). From cjfields at uiuc.edu Mon Jun 25 09:39:21 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 25 Jun 2007 08:39:21 -0500 Subject: [Bioperl-l] New testing base: BioperlTest.pm In-Reply-To: <467FBDD3.8050009@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> Message-ID: <53825978-65F8-49F4-8836-2F6A26A5CAC4@uiuc.edu> On Jun 25, 2007, at 8:06 AM, Sendu Bala wrote: > Sendu Bala wrote: >> In considering updating all the test scripts to [... use] t/lib/ >> BioperlTest.pm > > I'm now in the process of converting all test scripts. In addition to > those things mentioned previously, BioperlTest now also provides the > methods test_input_file() and test_output_file(). > > > This: > ---- > use Bio::Root::IO; > my $output_file = Bio::Root::IO->catfile(qw(t data temp.file)); > $obj->new(-file => ">$output_file"); > > END { > unlink($output_file); > } > > ... > > $obj->new(-file => Bio::Root::IO->catfile(qw(t data input.file))); > ---- > > > Becomes this: > ---- > my $output_file = test_output_file(); > $obj->new(-file => ">$output_file"); > > ... > > $obj->new(-file => test_input_file('input.file')); > ---- > > > I should think the benefits are obvious, especially for the output > files, which thanks to inconsistency of using END blocks correctly > or at > all, leaves some output data behind on occasion. Sounds fine by me, though it's a lot of work. 
BTW, did we ever decide whether to finish up with Test::More conversion? I haven't heard back yet; let me know what you want to do. > test_input_file() is helpful for the shorthand, but also gets rid of > many tests' usage of Bio::Root::IO (relying on something you're > installing and testing in another test script to work in the current > test script, without testing it in your own test script seems like a > no-no to me). Well, in a way isn't that itself a test of the class (whether it breaks or not)? ; > Do test_input_file() and test_input_file() handle directory structures in an OS-safe way like catfile()? For instance, I plan on adding test data to a new directory similar to Bio::Graphics (t/data/ eutil) to prevent cluttering of the t/data directory. I could use '$obj->new(-file => test_input_file('/eutil/input.xml'))' if the base directory is 't/data' but that may not be cross-platform compatible with win32 file systems, which may still expect something like 't\data \eutil\input.xml'. chris From bix at sendu.me.uk Mon Jun 25 09:45:23 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 25 Jun 2007 14:45:23 +0100 Subject: [Bioperl-l] New testing base: BioperlTest.pm In-Reply-To: <53825978-65F8-49F4-8836-2F6A26A5CAC4@uiuc.edu> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <53825978-65F8-49F4-8836-2F6A26A5CAC4@uiuc.edu> Message-ID: <467FC6F3.6080705@sendu.me.uk> Chris Fields wrote: > On Jun 25, 2007, at 8:06 AM, Sendu Bala wrote: >> I should think the benefits are obvious, especially for the output >> files, which thanks to inconsistency of using END blocks correctly or at >> all, leaves some output data behind on occasion. > > Sounds fine by me, though it's a lot of work. BTW, did we ever decide > whether to finish up with Test::More conversion? I haven't heard back > yet; let me know what you want to do. I'm doing the remaining Test::More conversions at the same time. 
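On the OS-safe path question above, here is a minimal sketch of joining path pieces with the core File::Spec module. The helper name my_input_file and the base directory t/data are assumptions for illustration only; this is not BioperlTest's actual code.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper, NOT the BioperlTest implementation: joins the given
# path pieces under an assumed base directory of t/data, using whatever
# directory separator the current platform expects.
sub my_input_file {
    my @parts = @_;
    return File::Spec->catfile('t', 'data', @parts);
}

# Pass each directory level as its own argument rather than embedding
# '/' or '\' in a single string.
my $path = my_input_file('eutil', 'input.xml');
print "$path\n";
```

Passing the pieces separately leaves the choice of separator to File::Spec, so the same test script does not need to know which platform it is running on.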
> Do test_input_file() and test_input_file() handle directory structures
> in an OS-safe way like catfile()?  For instance, I plan on adding test
> data to a new directory similar to Bio::Graphics (t/data/eutil) to
> prevent cluttering of the t/data directory.  I could use
> '$obj->new(-file => test_input_file('/eutil/input.xml'))' if the base
> directory is 't/data' but that may not be cross-platform compatible with
> win32 file systems, which may still expect something like
> 't\data\eutil\input.xml'.

It's platform-independent, currently implemented using File::Spec. So
you'll say:

$obj->new(-file => test_input_file('eutil', 'input.xml'));

It's all documented in the POD of BioperlTest.

From cjfields at uiuc.edu Mon Jun 25 09:49:51 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 25 Jun 2007 08:49:51 -0500
Subject: [Bioperl-l] New testing base: BioperlTest.pm
In-Reply-To: <467FC6F3.6080705@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
 <53825978-65F8-49F4-8836-2F6A26A5CAC4@uiuc.edu>
 <467FC6F3.6080705@sendu.me.uk>
Message-ID: <679B8E76-C090-4A29-B843-99B5853FE2FB@uiuc.edu>

On Jun 25, 2007, at 8:45 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> On Jun 25, 2007, at 8:06 AM, Sendu Bala wrote:
>>> I should think the benefits are obvious, especially for the output
>>> files, which thanks to inconsistency of using END blocks
>>> correctly or at
>>> all, leaves some output data behind on occasion.
>> Sounds fine by me, though it's a lot of work.  BTW, did we ever
>> decide whether to finish up with Test::More conversion?  I haven't
>> heard back yet; let me know what you want to do.
>
> I'm doing the remaining Test::More conversions at the same time.

Okay.  Just didn't want to do any redundant work if it's already
being/been done.

>> Do test_input_file() and test_input_file() handle directory
>> structures in an OS-safe way like catfile()?
For instance, I plan >> on adding test data to a new directory similar to Bio::Graphics (t/ >> data/eutil) to prevent cluttering of the t/data directory. I >> could use '$obj->new(-file => test_input_file('/eutil/ >> input.xml'))' if the base directory is 't/data' but that may not >> be cross-platform compatible with win32 file systems, which may >> still expect something like 't\data\eutil\input.xml'. > > Its platform-independent, currently implemented using File::Spec. > So you'll say: > > $obj->new(-file => test_input_file('eutil', 'input.xml')); > > Its all documented in the POD of BioperlTest. yay! chris From mmokrejs at ribosome.natur.cuni.cz Mon Jun 25 12:06:24 2007 From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=) Date: Mon, 25 Jun 2007 18:06:24 +0200 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? In-Reply-To: <467254DD.3010505@mrc-lmb.cam.ac.uk> References: <466938F6.7050903@ribosome.natur.cuni.cz> <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> <467178AE.5040905@ribosome.natur.cuni.cz> <46717990.6040509@ribosome.natur.cuni.cz> <467254DD.3010505@mrc-lmb.cam.ac.uk> Message-ID: <467FE800.4010300@ribosome.natur.cuni.cz> Dave Howorth wrote: > Martin MOKREJ? wrote: >>>> Also, there is a *huge* amount of documentation and examples on >>>> the BioPerl website. >>>> >>>> http://www.bioperl.org/wiki/HOWTOs >>> You mean >>> http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File >>> ? ;-) >> $ perl embl2picture.pl ~/99.gb | display - Error returned while >> evaluating value of 'description' option for glyph >> Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature >> Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method >> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl >> line 141, line 125. > > Hmm an error at line 141 of a 69 line script? Methinks you're not > actually running the script that's presented on the wiki page you > quoted. 
I cut-and-pasted the script and your file and it worked for me > (at least, it produced an image, along with a bunch of OOPS lines) Maybe you used the first version of the script? There are two or more scripts, I used the very last one. M. From cjfields at uiuc.edu Mon Jun 25 12:48:30 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 25 Jun 2007 11:48:30 -0500 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? In-Reply-To: <467FE7B0.3010904@ribosome.natur.cuni.cz> References: <466938F6.7050903@ribosome.natur.cuni.cz> <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> <467178AE.5040905@ribosome.natur.cuni.cz> <46717990.6040509@ribosome.natur.cuni.cz> <46723F91.60501@ribosome.natur.cuni.cz> <467FE7B0.3010904@ribosome.natur.cuni.cz> Message-ID: Martin, Keep bioperl-related discussion on the bioperl mail list. The large majority of this isn't biopython-related, but maybe some devs there can add to this? On Jun 25, 2007, at 11:05 AM, Martin MOKREJ? wrote: ... > Would you please tell me exactly what is wrong with the spacing? Here's a section of the seq record attached to your previous email: DEFINITION . ACCESSION . VERSION . SOURCE . ORGANISM . Normally there is a fixed column width for any data present in a field, so it would look more like this: DEFINITION PYR4 (DIHYDROOROTASE, PYRIMIDIN 4, dihydroorotase); dihydroorotase [Arabidopsis thaliana]. ACCESSION NP_194024 VERSION NP_194024.1 GI:15235865 DBSOURCE REFSEQ: accession NM_118422.3 KEYWORDS . SOURCE Arabidopsis thaliana (thale cress) ORGANISM Arabidopsis thaliana Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta; Spermatophyta; Magnoliophyta; eudicotyledons; core eudicotyledons; rosids; eurosids II; Brassicales; Brassicaceae; Arabidopsis. Here's the relevant bit in the latest release notes: "The second part of each sequence entry record contains the information appropriate to its keyword, in positions 13 to 80 for keywords and positions 11 to 80 for the sequence." 
The bioperl devs try to make our parsers as flexible as possible but others may not, so it's something in ApE that should probably be fixed. And as mentioned to you several times in the past on the mail list and on bugzilla, don't expect sequence records which sway from the standard (in this case, the release notes) to parse correctly in all cases. We can try supporting some that sway from that standard but only up to a point. If it causes additional bugs, headaches, or degrades performance it won't be supported. > ... > Well, I just copy&pasted the script from the bioperl webpages, I think > from a tutorial or FAQ, don't remember anymore. Well, can't help you if you can't point out where the code originated from. We would like to know so it can be corrected. > ... > Well, my search for such tools available on Unix to be used in a > script, > non-interactively, completely failed. My last hope except getting > improved > ApE is to use the GenomeDiagram under biopython, but so far my .gb > files > cannot be parsed yet. :( > Martin As mentioned previously you will likely have to code for it yourself (perl or python) or help debug the relevant biopython code to get it working. We can't/won't do this for you unless/until it's something we feel warrants implementation. Judging by the bug list, we also haven't the time nor inclination to code for it. Sorry but we have other priorities besides doing your work for you. chris From jesper at krogh.cc Tue Jun 26 03:05:32 2007 From: jesper at krogh.cc (Jesper Krogh) Date: Tue, 26 Jun 2007 09:05:32 +0200 (CEST) Subject: [Bioperl-l] Possible bug in Bio::SeqIO::embl.pm Message-ID: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk> Hi List. Trying to parse the embl database, the embl-parser fails on: AB019196 http://www.ebi.ac.uk/cgi-bin/expasyfetch?AB019196 ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: AB019196 seems to have an invalid species classification. 
STACK: Error::throw STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/Root.pm:359 STACK: Bio::SeqIO::embl::_read_EMBL_Species /usr/share/perl/5.8/Bio/SeqIO/embl.pm:1091 STACK: Bio::SeqIO::embl::next_seq /usr/share/perl/5.8/Bio/SeqIO/embl.pm:322 STACK: -e:1 ----------------------------------------------------------- It seems to be dissatisfied with this: OS Acetobacter aceti OC Bacteria; Proteobacteria; Alphaproteobacteria; Rhodospirillales; OC Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter. Thanks. -- Jesper Krogh From cjfields at uiuc.edu Tue Jun 26 09:13:50 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 26 Jun 2007 08:13:50 -0500 Subject: [Bioperl-l] Possible bug in Bio::SeqIO::embl.pm In-Reply-To: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk> References: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk> Message-ID: <246C49A7-D91E-4E74-921D-4B9C174AB41F@uiuc.edu> I can verify this using bioperl-live. Can you file this as a bug? http://bugzilla.open-bio.org/ chris On Jun 26, 2007, at 2:05 AM, Jesper Krogh wrote: > Hi List. > > Trying to parse the embl database, the embl-parser fails on: AB019196 > http://www.ebi.ac.uk/cgi-bin/expasyfetch?AB019196 > > > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: AB019196 seems to have an invalid species classification. > STACK: Error::throw > STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/Root.pm:359 > STACK: Bio::SeqIO::embl::_read_EMBL_Species > /usr/share/perl/5.8/Bio/SeqIO/embl.pm:1091 > STACK: Bio::SeqIO::embl::next_seq > /usr/share/perl/5.8/Bio/SeqIO/embl.pm:322 > STACK: -e:1 > ----------------------------------------------------------- > > > It seems to be dissatisfied with this: > OS Acetobacter aceti > OC Bacteria; Proteobacteria; Alphaproteobacteria; Rhodospirillales; > OC Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter. > > Thanks. 
> -- > Jesper Krogh > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From suji_ramin at yahoo.com Tue Jun 26 00:58:36 2007 From: suji_ramin at yahoo.com (SujiBala) Date: Mon, 25 Jun 2007 21:58:36 -0700 (PDT) Subject: [Bioperl-l] Error in constructing Phylogenetic tree using BioPerl Message-ID: <571051.26423.qm@web51107.mail.re2.yahoo.com> Hi Hello This is sujatha from singapore. I am trying to construct phylo tree using DNAStatistics and Kirma method. But I am getting the following error message. It would be nice if you could help me resolve this problem asap. Error messasge Must supply a valid Bio::Align::AlignI for the _align parameter in the distance My program use Bio::AlignIO; use Bio::Align::DNAStatistics; use Bio::Tree::DistanceFactory; # for a dna alignment can also use ProteinStatistics @aln = Bio::AlignIO->new(-file => 'out4.fa', -format=>'clustalw'); $stats = Bio::Align::DNAStatistics->new; $mat = $stats->distance( -align => @aln,-method => 'Kimura'); $dfactory = Bio::Tree::DistanceFactory->new(-method => 'NJ'); $tree = $dfactory->make_tree($mat); I am using clustalw formatted fasta file with more than one sequence SujiBala --------------------------------- Luggage? GPS? Comic books? Check out fitting gifts for grads at Yahoo! Search. From bartels.stefan at mh-hannover.de Tue Jun 26 05:26:03 2007 From: bartels.stefan at mh-hannover.de (don esteban) Date: Tue, 26 Jun 2007 02:26:03 -0700 (PDT) Subject: [Bioperl-l] Example code in Bioperl Tutorial In-Reply-To: References: Message-ID: <11302459.post@talk.nabble.com> Try using the Proxyconfiguration in your script: $ENV{"HTTP_PROXY"}="http://proxy.somewhere.org:8080"; L Xu wrote: > > I do have the internet connection bu not use the proxy server. 
> I tested the network connection with the ping command (below). The ncbi
> website does not respond. Is there any special network setting needed
> for connecting to the ncbi website?
> Thank you so much.
>
> C:\>ping www.yahoo.com
>
> Pinging www.yahoo-ht3.akadns.net [69.147.114.210] with 32 bytes of data:
>
> Reply from 69.147.114.210: bytes=32 time=363ms TTL=45
> Reply from 69.147.114.210: bytes=32 time=319ms TTL=45
> Reply from 69.147.114.210: bytes=32 time=312ms TTL=45
> Reply from 69.147.114.210: bytes=32 time=360ms TTL=45
>
> Ping statistics for 69.147.114.210:
> Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
> Approximate round trip times in milli-seconds:
> Minimum = 312ms, Maximum = 363ms, Average = 338ms
>
> C:\>ping www.ncbi.nlm.nih.gov
>
> Pinging www.ncbi.nlm.nih.gov [130.14.29.110] with 32 bytes of data:
>
> Request timed out.
> Request timed out.
> Request timed out.
> Request timed out.
>
> Ping statistics for 130.14.29.110:
> Packets: Sent = 4, Received = 0, Lost = 4 (100% loss),
>
>
> = = = Original message = = =
>
> Judging by the output it looks like you have no network access or can't
> connect to the server (what remoteblast needs). Make sure you don't need
> proxy settings.
>
> To preempt the next question, no, I'm not going to explain what a proxy
> is. The RemoteBlast docs show how to set them, and Google is a wonderful
> tool...
>
> chris
>
> On Jun 13, 2007, at 7:16 AM, L Xu wrote:
>
> ...
> -------------------- WARNING ---------------------
> MSG:
> An Error Occurred
>


> 500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
>
> ---------------------------------------------------
> ...

--
View this message in context: http://www.nabble.com/Example-code-in-Bioperl-Tutorial-tf3914295.html#a11302459
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.

From rahall2 at ualr.edu Tue Jun 26 09:51:08 2007
From: rahall2 at ualr.edu (Roger Hall)
Date: Tue, 26 Jun 2007 08:51:08 -0500
Subject: [Bioperl-l] Tuesday: ill
Message-ID: <000001c7b7f9$0d029040$4601a8c0@LIBERAL2>

Well I guess I won't be in today after all.

Michael, Stephen, and Ames: please call me from the grad office at 10 on
my cell phone (744-8514).

Phil: please go ahead and meet with Tim, and let me know what questions
remain afterwards.

Thanks!

Roger Hall
Technical Director
MidSouth Bioinformatics Center
University of Arkansas at Little Rock
(501) 569-8074

From cjfields at uiuc.edu Tue Jun 26 10:02:29 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 26 Jun 2007 09:02:29 -0500
Subject: [Bioperl-l] Possible bug in Bio::SeqIO::embl.pm
In-Reply-To: <4681185D.5030402@cam.ac.uk>
References: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk>
 <246C49A7-D91E-4E74-921D-4B9C174AB41F@uiuc.edu>
 <4681185D.5030402@cam.ac.uk>
Message-ID:

I'll try getting to that ASAP (as well as a few bugs).
The problem is we have to patch this in 2-3 places (SeqIO::swiss, SeqIO::embl) due to repeated code issues, something I'm trying to rectify with a new set of parsers. Just haven't had the time to work on them lately unfortunately. chris On Jun 26, 2007, at 8:45 AM, Roy Chaudhuri wrote: > Sorry, replied to this but forgot to cc the list. > > It looks like a related problem to bug 2288 that I filed about > Bio::SeqIO::swiss - the period after subgen. is what causes the > problems since it is interpreted as a seperator between nodes. I > put a patch in for Bio::SeqIO::swiss that works for me, but I guess > it might have side effects. > > Roy. > -- > Dr. Roy Chaudhuri > Department of Veterinary Medicine > University of Cambridge, U.K. > > Chris Fields wrote: >> I can verify this using bioperl-live. Can you file this as a bug? >> http://bugzilla.open-bio.org/ >> chris >> On Jun 26, 2007, at 2:05 AM, Jesper Krogh wrote: >>> Hi List. >>> >>> Trying to parse the embl database, the embl-parser fails on: >>> AB019196 >>> http://www.ebi.ac.uk/cgi-bin/expasyfetch?AB019196 >>> >>> >>> ------------- EXCEPTION: Bio::Root::Exception ------------- >>> MSG: AB019196 seems to have an invalid species classification. >>> STACK: Error::throw >>> STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/ >>> Root.pm:359 >>> STACK: Bio::SeqIO::embl::_read_EMBL_Species >>> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:1091 >>> STACK: Bio::SeqIO::embl::next_seq >>> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:322 >>> STACK: -e:1 >>> ----------------------------------------------------------- >>> >>> >>> It seems to be dissatisfied with this: >>> OS Acetobacter aceti >>> OC Bacteria; Proteobacteria; Alphaproteobacteria; >>> Rhodospirillales; >>> OC Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter. >>> >>> Thanks. 
>>> -- >>> Jesper Krogh >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> Christopher Fields >> Postdoctoral Researcher >> Lab of Dr. Robert Switzer >> Dept of Biochemistry >> University of Illinois Urbana-Champaign >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From rrc22 at cam.ac.uk Tue Jun 26 09:45:01 2007 From: rrc22 at cam.ac.uk (Roy Chaudhuri) Date: Tue, 26 Jun 2007 14:45:01 +0100 Subject: [Bioperl-l] Possible bug in Bio::SeqIO::embl.pm In-Reply-To: <246C49A7-D91E-4E74-921D-4B9C174AB41F@uiuc.edu> References: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk> <246C49A7-D91E-4E74-921D-4B9C174AB41F@uiuc.edu> Message-ID: <4681185D.5030402@cam.ac.uk> Sorry, replied to this but forgot to cc the list. It looks like a related problem to bug 2288 that I filed about Bio::SeqIO::swiss - the period after subgen. is what causes the problems since it is interpreted as a seperator between nodes. I put a patch in for Bio::SeqIO::swiss that works for me, but I guess it might have side effects. Roy. -- Dr. Roy Chaudhuri Department of Veterinary Medicine University of Cambridge, U.K. Chris Fields wrote: > I can verify this using bioperl-live. Can you file this as a bug? > > http://bugzilla.open-bio.org/ > > chris > > On Jun 26, 2007, at 2:05 AM, Jesper Krogh wrote: > >> Hi List. >> >> Trying to parse the embl database, the embl-parser fails on: AB019196 >> http://www.ebi.ac.uk/cgi-bin/expasyfetch?AB019196 >> >> >> ------------- EXCEPTION: Bio::Root::Exception ------------- >> MSG: AB019196 seems to have an invalid species classification. 
>> STACK: Error::throw >> STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/Root.pm:359 >> STACK: Bio::SeqIO::embl::_read_EMBL_Species >> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:1091 >> STACK: Bio::SeqIO::embl::next_seq >> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:322 >> STACK: -e:1 >> ----------------------------------------------------------- >> >> >> It seems to be dissatisfied with this: >> OS Acetobacter aceti >> OC Bacteria; Proteobacteria; Alphaproteobacteria; Rhodospirillales; >> OC Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter. >> >> Thanks. >> -- >> Jesper Krogh >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From bix at sendu.me.uk Tue Jun 26 10:13:48 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 26 Jun 2007 15:13:48 +0100 Subject: [Bioperl-l] Error in constructing Phylogenetic tree using BioPerl In-Reply-To: <571051.26423.qm@web51107.mail.re2.yahoo.com> References: <571051.26423.qm@web51107.mail.re2.yahoo.com> Message-ID: <46811F1C.3020307@sendu.me.uk> SujiBala wrote: > Hi Hello > This is sujatha from singapore. I am trying to construct phylo tree using DNAStatistics and Kirma method. But I am getting the following error message. It would be nice if you could help me resolve this problem asap. 
> > Error messasge > Must supply a valid Bio::Align::AlignI for the _align parameter in the distance > My program > use Bio::AlignIO; > use Bio::Align::DNAStatistics; > use Bio::Tree::DistanceFactory; > # for a dna alignment can also use ProteinStatistics > @aln = Bio::AlignIO->new(-file => 'out4.fa', -format=>'clustalw'); > $stats = Bio::Align::DNAStatistics->new; > $mat = $stats->distance( -align => @aln,-method => 'Kimura'); Without looking at the docs for these modules, it is immediately obvious that Bio::AlignIO->new() is going to return an instance of Bio::AlignIO and not an array of alignments. It is also obvious that the -align => parameter for the distance() method can't take an array of anything (but probably an array ref?). Check the documentation and make sure you know what objects you're generating and passing around. From schlesi at ebi.ac.uk Tue Jun 26 10:59:13 2007 From: schlesi at ebi.ac.uk (Felix Schlesinger) Date: Tue, 26 Jun 2007 15:59:13 +0100 Subject: [Bioperl-l] PAML parser Message-ID: <7317d50c0706260759r3bdda445lf0acf10a31ee5765@mail.gmail.com> Hello, I am trying to use the PAML result parser (BioPerl Bio::Tools::Phylo::PAML) on output files generated by PAML 3.15. However on all outputs I have tested no result object is returned (next_result is undef). This includes the HIV and Lysin datasets included with PAML. My code is: my $codemlp = Bio::Tools::Phylo::PAML->new(-file => "mlc",dir => "/."); my $result = $codemlp->next_result; foreach my $model ( $result->get_NSSite_results ) { ... and the error is: Can't call method "get_NSSite_results" on an undefined value ... I can include the mlc file is needed. Is this supposed to work? Or do I have to run paml from bioperl to parse the results? 
Thanks Felix From Xianjun.Dong at bccs.uib.no Tue Jun 26 10:35:17 2007 From: Xianjun.Dong at bccs.uib.no (Xianjun Dong) Date: Tue, 26 Jun 2007 16:35:17 +0200 Subject: [Bioperl-l] bug for PAML::Baseml Message-ID: <46812425.8000509@ii.uib.no> An HTML attachment was scrubbed... URL: From Xianjun.Dong at bccs.uib.no Tue Jun 26 11:40:47 2007 From: Xianjun.Dong at bccs.uib.no (Xianjun Dong) Date: Tue, 26 Jun 2007 17:40:47 +0200 Subject: [Bioperl-l] bug for PAML::Baseml In-Reply-To: <46812425.8000509@ii.uib.no> References: <46812425.8000509@ii.uib.no> Message-ID: <4681337F.1000902@ii.uib.no> An HTML attachment was scrubbed... URL: From hartzell at alerce.com Tue Jun 26 14:12:04 2007 From: hartzell at alerce.com (George Hartzell) Date: Tue, 26 Jun 2007 14:12:04 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> Message-ID: <18049.22260.967524.353173@almost.alerce.com> There don't seem to be any .cvsignore files in the repository, or in CVSROOT/cvsignore. Am I missing something, or don't we use them? g. From cjfields at uiuc.edu Tue Jun 26 15:54:25 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 26 Jun 2007 14:54:25 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <18049.22260.967524.353173@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.22260.967524.353173@almost.alerce.com> Message-ID: <74515C87-5553-4AF0-9B83-26F3E71E15C8@uiuc.edu> Not sure. 
You may want to email support at open-bio.org; my guess is Chris D or Jason would have an answer. chris On Jun 26, 2007, at 1:12 PM, George Hartzell wrote: > > There don't seem to be any .cvsignore files in the repository, or in > CVSROOT/cvsignore. > > Am I missing something, or don't we use them? > > g. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Tue Jun 26 15:55:21 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 26 Jun 2007 16:55:21 -0300 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <18049.22260.967524.353173@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.22260.967524.353173@almost.alerce.com> Message-ID: Maybe we've been using the default? On Jun 26, 2007, at 3:12 PM, George Hartzell wrote: > > There don't seem to be any .cvsignore files in the repository, or in > CVSROOT/cvsignore. > > Am I missing something, or don't we use them? > > g. 
> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hartzell at alerce.com Tue Jun 26 16:21:30 2007 From: hartzell at alerce.com (George Hartzell) Date: Tue, 26 Jun 2007 16:21:30 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> Message-ID: <18049.30026.61328.134490@almost.alerce.com> Chris Fields writes: > [...] > It looks like George Hartzell may be taking a crack at it, with > Rutger Vos, Nathan Haigh, and moi helping out where needed. If so we > could have something testable relatively soon. After that we'll need > to work out a few other issues, basically what's on Hilmar's list. There's a repository on file:///home/hartzell/bioperl with all of the components projects in place. If you have a dev.open-bio.org account and you're in the bioperl group, you're good to get at it via: file:///home/hartzell/bioperl or svn+ssh://dev.open-bio.org/home/hartzell/bioperl There are a couple of things to think about: - how are we going to provide access. I *think* that I heard a decision to use http:// and https://. Who gets to set that up? - what do we want to do about keywords. The cvs2svn tool guesses and automatically sets the svn:keywords property to Author Date Revision and Id on many of the files in the tree. If it looks like it got it right, we can stick with it. 
Or, we can disable that conversion and I've cribbed a little script that'll grep out files using Id and set the svn:keywords property accordingly. - what do we want to do about svn:ignore? I haven't seen any .cvsignore files. Beyond that, how does the repo look? How are we going to cut over? Are we going to try to push svn commits to the read-mostly CVS repo, or just keep it around for history's sake (I lean towards the latter). g. From jason at bioperl.org Tue Jun 26 19:22:20 2007 From: jason at bioperl.org (Jason Stajich) Date: Tue, 26 Jun 2007 20:22:20 -0300 Subject: [Bioperl-l] PAML parser In-Reply-To: <7317d50c0706260759r3bdda445lf0acf10a31ee5765@mail.gmail.com> References: <7317d50c0706260759r3bdda445lf0acf10a31ee5765@mail.gmail.com> Message-ID: Can you make sure you have the latest and greatest version of these modules from the CVS repository? We had to fix things to parse 3.15 -- I can't tell if this is the problem or something else. You can also add -verbose => 1when you initialize the object and it may spit out more warnings about whether it is having problems. -jason On Jun 26, 2007, at 11:59 AM, Felix Schlesinger wrote: > Hello, > > I am trying to use the PAML result parser (BioPerl > Bio::Tools::Phylo::PAML) on output files generated by PAML 3.15. > However on all outputs I have tested no result object is returned > (next_result is undef). This includes the HIV and Lysin datasets > included with PAML. > My code is: > > my $codemlp = Bio::Tools::Phylo::PAML->new(-file => "mlc",dir => > "/."); > my $result = $codemlp->next_result; > foreach my $model ( $result->get_NSSite_results ) { > ... > > and the error is: Can't call method "get_NSSite_results" on an > undefined value ... > > I can include the mlc file is needed. Is this supposed to work? Or do > I have to run paml from bioperl to parse the results? 
> > Thanks > Felix > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From jason at bioperl.org Tue Jun 26 19:27:05 2007 From: jason at bioperl.org (Jason Stajich) Date: Tue, 26 Jun 2007 20:27:05 -0300 Subject: [Bioperl-l] Error in constructing Phylogenetic tree using BioPerl In-Reply-To: <46811F1C.3020307@sendu.me.uk> References: <571051.26423.qm@web51107.mail.re2.yahoo.com> <46811F1C.3020307@sendu.me.uk> Message-ID: On Jun 26, 2007, at 11:13 AM, Sendu Bala wrote: > SujiBala wrote: >> Hi Hello >> This is sujatha from singapore. I am trying to construct phylo >> tree using DNAStatistics and Kirma method. But I am getting the >> following error message. It would be nice if you could help me >> resolve this problem asap. >> >> Error messasge >> Must supply a valid Bio::Align::AlignI for the _align >> parameter in the distance >> My program >> use Bio::AlignIO; >> use Bio::Align::DNAStatistics; >> use Bio::Tree::DistanceFactory; >> # for a dna alignment can also use ProteinStatistics >> @aln = Bio::AlignIO->new(-file => 'out4.fa', -format=>'clustalw'); >> $stats = Bio::Align::DNAStatistics->new; >> $mat = $stats->distance( -align => @aln,-method => 'Kimura'); > yep you want to call next_aln on the Bio::AlignIO object. I fixed the example code in the HOWTO so it should work properly now; http://bioperl.org/wiki/HOWTO:Trees#Constructing_Trees > Without looking at the docs for these modules, it is immediately > obvious > that Bio::AlignIO->new() is going to return an instance of > Bio::AlignIO > and not an array of alignments. It is also obvious that the -align => > parameter for the distance() method can't take an array of anything > (but > probably an array ref?). > > Check the documentation and make sure you know what objects you're > generating and passing around. 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From jason at bioperl.org Tue Jun 26 19:29:11 2007 From: jason at bioperl.org (Jason Stajich) Date: Tue, 26 Jun 2007 20:29:11 -0300 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.22260.967524.353173@almost.alerce.com> Message-ID: <5A8FD8A3-9593-4925-AA74-D4B03CDC1C34@bioperl.org> We don't have one. I have one on my local machine that defined basically *~ and .#* so I never had a problem. Feel free to propose one if you think it is important, I never really though it was important. On Jun 26, 2007, at 4:55 PM, Hilmar Lapp wrote: > Maybe we've been using the default? > > On Jun 26, 2007, at 3:12 PM, George Hartzell wrote: > >> >> There don't seem to be any .cvsignore files in the repository, or in >> CVSROOT/cvsignore. >> >> Am I missing something, or don't we use them? >> >> g. 
>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From j_martin at lbl.gov Tue Jun 26 21:01:29 2007 From: j_martin at lbl.gov (Joel Martin) Date: Tue, 26 Jun 2007 18:01:29 -0700 Subject: [Bioperl-l] Example code in Bioperl Tutorial In-Reply-To: <11302459.post@talk.nabble.com> References: <11302459.post@talk.nabble.com> Message-ID: <20070627010129.GA8628@eniac.jgi-psf.org> Hello, The tutorial code snippet is an endless loop, I think it's supposed to remove the rid. As the only print statement you added is after the endless loop, you aren't seeing anything happen. Use the code from this instead, perldoc Bio::Tools::Run::RemoteBlast The bptutorial.pl does have a note that it's not useful and to read the pod for Bio::Tools::Run::RemoteBlast, it's in the next sentences after the code snippet you used. Though, as it's a tutorial example it might be nice to remove the while loop .. or at least add the sleep(5) part. http://www.bioperl.org/wiki/Bptutorial.pl#Running_BLAST_.28using_RemoteBlast.pm.29 Aside from that, you may have network issues but www.ncbi.nlm.nih.gov doesn't respond to ping as far as I can tell. Joel On Tue, Jun 26, 2007 at 02:26:03AM -0700, don esteban wrote: > > Try using the Proxyconfiguration in your script: > > $ENV{"HTTP_PROXY"}="http://proxy.somewhere.org:8080"; > > > > > L Xu wrote: > > > > I do have the internet connection bu not use the proxy server. 
> > I tested the network connection with ping command (below). The ncbi > > website > > does not response. Is there any special network setting needed for > > connecting the ncbi website? > > Thank you so much. > > > > C:\>ping www.yahoo.com > > > > Pinging www.yahoo-ht3.akadns.net [69.147.114.210] with 32 bytes of data: > > > > Reply from 69.147.114.210: bytes=32 time=363ms TTL=45 > > Reply from 69.147.114.210: bytes=32 time=319ms TTL=45 > > Reply from 69.147.114.210: bytes=32 time=312ms TTL=45 > > Reply from 69.147.114.210: bytes=32 time=360ms TTL=45 > > > > Ping statistics for 69.147.114.210: > > Packets: Sent = 4, Received = 4, Lost = 0 (0% loss), > > Approximate round trip times in milli-seconds: > > Minimum = 312ms, Maximum = 363ms, Average = 338ms > > > > C:\>ping www.ncbi.nlm.nih.gov > > > > Pinging www.ncbi.nlm.nih.gov [130.14.29.110] with 32 bytes of data: > > > > Request timed out. > > Request timed out. > > Request timed out. > > Request timed out. > > > > Ping statistics for 130.14.29.110: > > Packets: Sent = 4, Received = 0, Lost = 4 (100% loss), > > > > > > > > = = = Original message = = = > > > > Judging by the output it looks like you have no network access or? can't > > connect to the server (what remoteblast needs).? Make sure you? don't need > > proxy settings. > > > > To preempt the next question, no, I'm not going to explain what a? proxy > > is.? The RemoteBlast docs show how to set them, and Google is a? wonderful > > tool... > > > > chris > > > > On Jun 13, 2007, at 7:16 AM, L Xu wrote: > > > > > > ... > > -------------------- WARNING --------------------- > > MSG: > > An Error Occurred > > > >


> > 500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error) > > > > > > > > --------------------------------------------------- > > ... > > > > ___________________________________________________________ > > Sent by ePrompter, the premier email notification software. > > Free download at http://www.ePrompter.com. > > > > _________________________________________________________________ > > Get a preview of Live Earth, the hottest event this summer - only on MSN > > http://liveearth.msn.com?source=msntaglineliveearthhm > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > > > -- > View this message in context: http://www.nabble.com/Example-code-in-Bioperl-Tutorial-tf3914295.html#a11302459 > Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From melvinp at pacific.net.sg Wed Jun 27 01:25:08 2007 From: melvinp at pacific.net.sg (Melvin P) Date: Wed, 27 Jun 2007 13:25:08 +0800 Subject: [Bioperl-l] finding statistics on AA Message-ID: <4681F4B4.8010609@pacific.net.sg> Hi, I am new to BioPerl. I am trying to find out if there is any class that I can use for occupancy number/occurrence counts, pseudocounts, observed frequencies etc., given a few amino acid sequences. For example, what is the observed frequency of residue i at position p? My objective is to analyze the information content. Thanks.
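For illustration only, here is a minimal sketch of the per-position statistics asked about above, written in plain Perl and not tied to any particular BioPerl class. It assumes the sequences are already aligned and of equal length. The example sequences, the pseudocount of 1, and the helper name column_stats are hypothetical, and information content is taken here as log2(20) minus the Shannon entropy of the observed frequencies, with no small-sample correction.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The 20 standard amino acids; anything else (gaps, X, B, Z) is skipped.
my @alphabet = split //, 'ACDEFGHIKLMNPQRSTVWY';

# Assumed pseudocount added to every residue when smoothing.
my $pseudo = 1;

# Hypothetical input: aligned sequences of equal length.
my @seqs = qw(MKVLA MKILA MRVLG MKVLA);

# column_stats(\@seqs, $pos): statistics for one zero-based column.
# Returns a hash ref with:
#   n        number of sequences with a standard residue in the column
#   count    occurrence count per residue
#   freq     observed frequency per residue (count / n)
#   smoothed pseudocount-adjusted frequency per residue
#   info     information content in bits: log2(20) - entropy
sub column_stats {
    my ($seqs, $pos) = @_;
    my %count = map { $_ => 0 } @alphabet;
    my $n = 0;
    for my $s (@$seqs) {
        my $res = uc substr($s, $pos, 1);
        next unless exists $count{$res};
        $count{$res}++;
        $n++;
    }

    my (%freq, %smoothed);
    my $k    = scalar @alphabet;
    my $info = $n ? log($k) / log(2) : 0;    # start from the maximum
    for my $r (@alphabet) {
        $freq{$r}     = $n ? $count{$r} / $n : 0;
        $smoothed{$r} = ($count{$r} + $pseudo) / ($n + $pseudo * $k);
        # Adding f * log2(f) (a negative number) subtracts the entropy.
        $info += $freq{$r} * log($freq{$r}) / log(2) if $freq{$r} > 0;
    }

    return {
        n        => $n,
        count    => \%count,
        freq     => \%freq,
        smoothed => \%smoothed,
        info     => $info,
    };
}

# Report every column, listing only the residues that actually occur.
for my $p (0 .. length($seqs[0]) - 1) {
    my $st   = column_stats(\@seqs, $p);
    my @seen = grep { $st->{count}{$_} > 0 } @alphabet;
    printf "pos %d  info %.3f bits  %s\n", $p + 1, $st->{info},
        join ' ',
        map { sprintf '%s:%d (%.2f)', $_, $st->{count}{$_}, $st->{freq}{$_} } @seen;
}
```

If the sequences come from an alignment file, the same helper could be fed the sequence strings of each alignment member; how those strings are obtained is left out here on purpose.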
From bix at sendu.me.uk Wed Jun 27 06:23:58 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 27 Jun 2007 11:23:58 +0100 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <467FBDD3.8050009@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> Message-ID: <46823ABE.2080300@sendu.me.uk> Sendu Bala wrote: > Sendu Bala wrote: >> In considering updating all the test scripts to [... use] >> t/lib/BioperlTest.pm > > I'm now in the process of converting all test scripts. And I've now completed that job (for bioperl-live at least), except for t/EUtilities.t since I know Chris is working on it. In addition to converting to Test::More where necessary, I've also made all psuedo-TODO blocks real ones. Previously I had advised to use SKIP blocks instead since TODO blocks need a Test::Harness upgrade. However I think in the next release we ought to make such upgrading compulsory (which should be automatic when combined with compulsory usage of Module::Build and Test::More in turn: users simply have to update CPAN). The conversion to BioperlTest directly led to the discovery and fixing of 6 minor bugs, so was certainly not without merit. No user or developer needs to have BIOPERLDEBUG permanently set to true anymore. To run all tests you just have to answer yes to the BioDBGFF and networking questions of 'perl Build.PL'. With './Build test' you then get clean, easy-to-read output where it is obvious to see that we currently have these issues: t/Sopma.t and t/BioGraphics.t still have fails that I mentioned in another thread. t/protgraph.t, t/blast_pull.t, t/SearchIO.t, t/RestrictionIO.t, t/RNA_SearchIO.t, t/PopGen.t, t/Genewise.t, t/Assembly.t and t/Annotation.t all have TODO tests. If you know about those modules, now would be a great time to implement those TODOs! Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are deprecated' warnings. 
To debug a particular test you could say: BIOPERLDEBUG=1 ./Build test --verbose --test_files t/Sopma.t I've updated the HOWTO for writing test scripts: http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests From cjfields at uiuc.edu Wed Jun 27 07:55:47 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 06:55:47 -0500 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <46823ABE.2080300@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> Message-ID: On Jun 27, 2007, at 5:23 AM, Sendu Bala wrote: > Sendu Bala wrote: >> Sendu Bala wrote: >>> In considering updating all the test scripts to [... use] >>> t/lib/BioperlTest.pm >> >> I'm now in the process of converting all test scripts. > > And I've now completed that job (for bioperl-live at least), except > for > t/EUtilities.t since I know Chris is working on it. The network tests will be much shorter; the bulk will be transferred to a new suite for the backend Bio::Tools:EUtilities parser (which will test static files in t/data/eutils, so no dynamic changes). > In addition to converting to Test::More where necessary, I've also > made > all psuedo-TODO blocks real ones. Previously I had advised to use SKIP > blocks instead since TODO blocks need a Test::Harness upgrade. > However I > think in the next release we ought to make such upgrading compulsory > (which should be automatic when combined with compulsory usage of > Module::Build and Test::More in turn: users simply have to update > CPAN). Sounds good to me, but there may be some grumblings out there. Having specific TODOs are nice b/c we can test them w/o fails. Handy. > The conversion to BioperlTest directly led to the discovery and fixing > of 6 minor bugs, so was certainly not without merit. > > > No user or developer needs to have BIOPERLDEBUG permanently set to > true > anymore. 
To run all tests you just have to answer yes to the BioDBGFF > and networking questions of 'perl Build.PL'. With './Build test' you > then get clean, easy-to-read output where it is obvious to see that we > currently have these issues: > > t/Sopma.t and t/BioGraphics.t still have fails that I mentioned in > another thread. > > t/protgraph.t, t/blast_pull.t, t/SearchIO.t, t/RestrictionIO.t, > t/RNA_SearchIO.t, t/PopGen.t, t/Genewise.t, t/Assembly.t and > t/Annotation.t all have TODO tests. If you know about those > modules, now > would be a great time to implement those TODOs! The RNA_SearchIO.t is from ERPIN output; there's no easy way to generate it beyond having the user supply the info (or having the program author change the output). Will have to look at the others to see what's involved; maybe something for the priority list? > Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are > deprecated' warnings. I ran into this with XML::Simple data structures recently; there was an easy way around it via XML::Simple using forcearray(). It has to do with attempting to assign data to/from a hash in a specific way involving array references (though I can't remember exactly how; I slept since then). > To debug a particular test you could say: > BIOPERLDEBUG=1 ./Build test --verbose --test_files t/Sopma.t > > > I've updated the HOWTO for writing test scripts: > http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests Good work! chris From schlesi at ebi.ac.uk Wed Jun 27 07:57:27 2007 From: schlesi at ebi.ac.uk (Felix Schlesinger) Date: Wed, 27 Jun 2007 12:57:27 +0100 Subject: [Bioperl-l] Selecting columns from alignment Message-ID: <7317d50c0706270457i1c3d92a8hb124fa663f51b837@mail.gmail.com> Hi, is there an elegant way to select columns from an alignment object fulfilling a certain property (for example less than x gaps)? Everything I can see from Align::AlignI seems to involve looking at the individual sequences, creating lots of slices and appending them. 
If there a better way in bioperl or failing that, does anyone know a software package with similar functionality (t-coffee has lots of filters for alignments, but nothing to select columns besides by position it seems). Ideally this would also return a mapping from old to new positions in one of the sequences of course. Thanks Felix From cjfields at uiuc.edu Wed Jun 27 10:36:41 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 09:36:41 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <18049.30026.61328.134490@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> Message-ID: <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> On Jun 26, 2007, at 3:21 PM, George Hartzell wrote: > ... > If you have a dev.open-bio.org account and you're in the bioperl > group, you're good to get at it via: > > file:///home/hartzell/bioperl > > or > > svn+ssh://dev.open-bio.org/home/hartzell/bioperl I managed to get it working using file://. Haven't tried svn+ssh yet but I've had persistent problems getting ssh to work properly on my macbook; not sure why yet but I haven't had time to play around with it. > There are a couple of things to think about: > > - how are we going to provide access. I *think* that I heard a > decision to use http:// and https://. Who gets to set that up? That hasn't been decided yet and will be up to a consensus of the core devs, but I think the odds are in favor of allowing https:// but against allowing http://. As for setup that could be anyone with admin privs, though it may be best left up to Chris D, Jason, or Mauricio. > - what do we want to do about keywords. 
The cvs2svn tool guesses > and automatically sets the svn:keywords property to Author Date > Revision and Id on many of the files in the tree. If it looks > like it got it right, we can stick with it. Or, we can disable > that conversion and I've cribbed a little script that'll grep out > files using Id and set the svn:keywords property accordingly. Probably again a consensus issue, but you can choose one route. My inclination is the former if it's easier. > - what do we want to do about svn:ignore? I haven't seen any > .cvsignore files. Not sure. I've never used one personally, but (as Jason suggests) if you have ideas for one you can propose them, or we can suggest devs set up svn::ignore locally. > Beyond that, how does the repo look? Seems fine, though a simple 'svn file:///home/hartzell/bioperl' checkout gets everything (all distros, branches, etc). We need to make sure everyone uses 'svn co file:///home/hartzell/bioperl/bioperl- live/trunk /live' or similar if they just want the latest core/db/etc. We'll also need to start a svn wiki page to show how to get relevant distros (similar in style probably to the cvs page, with dev information, how to set up ssh keys, https stuff, etc). > How are we going to cut over? > > Are we going to try to push svn commits to the read-mostly CVS repo, > or just keep it around for history's sake (I lean towards the latter). I think a clean cut-over. Everyone would be warned to hold commits for a day (lest they be lost), then probably do something in this order: - switch cvs to read-only except for svn commits - run a clean cvs2svn - set up svn as read/write - set up test commits to cvs via svn - disable cvs commit messages to bioperl-guts, enable svn commit messages in it's place. - push svn commits over to read-only cvs cvs >>must<< be read-only after that point (no cvs->svn commits), with write access only available through svn. 
If at some future point there is no reason to keep it around or that it is more trouble than it's worth, we can make a decision then on cvs's fate. > g. chris From rvos at interchange.ubc.ca Wed Jun 27 10:23:25 2007 From: rvos at interchange.ubc.ca (rvos) Date: Wed, 27 Jun 2007 07:23:25 -0700 (PDT) Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] Message-ID: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> > Are we going to try to push svn commits to the read-mostly CVS repo, > or just keep it around for history's sake (I lean towards the latter). I'm a little confused - surely once the svn is up and running we'll want *no more* cvs commits? Parallel repositories that each accumulate stuff will be a nightmare. I'm probably just not getting your point. Rutger From cjfields at uiuc.edu Wed Jun 27 11:18:03 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 10:18:03 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> Message-ID: <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> On Jun 27, 2007, at 9:23 AM, rvos wrote: > >> Are we going to try to push svn commits to the read-mostly CVS repo, >> or just keep it around for history's sake (I lean towards the >> latter). > > I'm a little confused - surely once the svn is up and running we'll > want *no more* cvs commits? Parallel repositories that each > accumulate stuff will be a nightmare. I'm probably just not getting > your point. > > Rutger Most projects make a clean break with cvs (no more commits) for the reasons you point out. Not sure how the other core devs feel about that but I could go for that; it would def. prevent headaches. We could keep cvs for the time being as read-only, with no svn->cvs syncing. 
There are few projects which have (as a phase-out plan) old read-only cvs repositories available, with an automatic svn->cvs commit following every new svn commit. Not sure how that works, esp. for branching/merging and so on which I could see potentially getting hairy. chris From cjfields at uiuc.edu Wed Jun 27 12:05:49 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 11:05:49 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <18049.30026.61328.134490@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> Message-ID: <5EA56270-3427-4995-B3C1-2789229AACF1@uiuc.edu> On Jun 26, 2007, at 3:21 PM, George Hartzell wrote: > ...If you have a dev.open-bio.org account and you're in the bioperl > group, you're good to get at it via: > > file:///home/hartzell/bioperl > > or > > svn+ssh://dev.open-bio.org/home/hartzell/bioperl Did manage to get svn+ssh working (with some password harassment); core tests passed enough that I think everything's okay. If ssh keys are set up correctly (mine aren't) it should work fine. 
chris From dmessina at wustl.edu Wed Jun 27 12:27:32 2007 From: dmessina at wustl.edu (David Messina) Date: Wed, 27 Jun 2007 11:27:32 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> Message-ID: > [Chris] > > I managed to get it working using file://. Haven't tried svn+ssh yet > but I've had persistent problems getting ssh to work properly on my > macbook; not sure why yet but I haven't had time to play around > with it. I just did a checkout and a test commit, both via svn+ssh -- works great for me. >> [George] >> >> - what do we want to do about keywords. The cvs2svn tool guesses >> and automatically sets the svn:keywords property to Author Date >> Revision and Id on many of the files in the tree. If it looks >> like it got it right, we can stick with it. Or, we can disable >> that conversion and I've cribbed a little script that'll grep out >> files using Id and set the svn:keywords property accordingly. I would think we would want "Author Date Id Rev URL" set on everything, no?. So either cvs2svn or your tool (whichever you think is better), followed by svn propset svn:keywords "Author Date Id Rev URL" * from the root of a working copy would take care of all of the existing files in the repository, I think. George knows more about this than I do, but I think you can set up a global config file with enable-auto-props = yes * = svn:keywords="Author Date Id Rev URL" to ensure it gets set on any future additions to the repository. >> - what do we want to do about svn:ignore? I haven't seen any >> .cvsignore files. > > Not sure. 
I've never used one personally, but (as Jason suggests) if > you have ideas for one you can propose them, or we can suggest devs > set up svn:ignore locally. I use the default global-ignores global-ignores = *.o *.lo *.la #*# .*.rej *.rej .*~ *~ .#* .DS_Store (again, in my system-wide config file), but I'm not tied to that. I do think we should have one, though; individuals can easily override any settings in the system-wide config with their own ~/.subversion/ config. >> Beyond that, how does the repo look? Looks great, George! Thanks for doing this. Dave From hartzell at alerce.com Wed Jun 27 13:00:53 2007 From: hartzell at alerce.com (George Hartzell) Date: Wed, 27 Jun 2007 13:00:53 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> Message-ID: <18050.38853.526224.791878@almost.alerce.com> rvos writes: > > > Are we going to try to push svn commits to the read-mostly CVS repo, > > or just keep it around for history's sake (I lean towards the latter). > > I'm a little confused - surely once the svn is up and running we'll > want *no more* cvs commits? Parallel repositories that each > accumulate stuff will be a nightmare. I'm probably just not getting > your point. There had been some point of keeping a CVS repository around as a read-only mirror of the svn repo, presumably for people whose habits or setup won't let them use svn. In theory, each commit to the svn repo can be automagically pushed down into CVS w/out user intervention; Google will tell you how, but I've never run anything that way. g.
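The "little script" George mentions for finding files that use Id is not shown anywhere in this thread. As a rough illustration only, here is one way such a helper might look in Perl. The function name files_using_keywords, the default keyword list, and the choice to print svn propset commands rather than run them are all assumptions made for this sketch, not George's actual script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: return a sorted list of text files under $root that
# contain an RCS-style keyword, either collapsed ($Id$) or expanded
# ($Id: Seq.pm,v 1.1 ... $). Defaults to looking for Id only.
sub files_using_keywords {
    my ($root, @keywords) = @_;
    @keywords = ('Id') unless @keywords;
    my $alternation = join '|', map { quotemeta($_) } @keywords;
    my $pattern     = qr/\$(?:$alternation)(?::[^\$\n]*)?\$/;

    my @hits;
    find(
        {
            no_chdir => 1,
            wanted   => sub {
                my $path = $File::Find::name;
                # Do not descend into Subversion or CVS admin directories.
                if (-d $path && $path =~ m{(?:^|/)(?:\.svn|CVS)$}) {
                    $File::Find::prune = 1;
                    return;
                }
                return unless -f $path && -T $path;    # plain text files only
                open my $fh, '<', $path or return;
                while (my $line = <$fh>) {
                    if ($line =~ $pattern) {
                        push @hits, $path;
                        last;
                    }
                }
                close $fh;
            },
        },
        $root
    );
    return sort @hits;
}

# Run directly with a working-copy path: print (do not execute) one
# propset command per matching file so the list can be reviewed first.
unless (caller) {
    my $root = shift @ARGV;
    if (defined $root) {
        print qq{svn propset svn:keywords "Id" "$_"\n}
            for files_using_keywords($root);
    }
}
1;
```

Printing the commands instead of running them keeps the sketch side-effect free; piping the output to a shell would apply the property file by file.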
From dmessina at wustl.edu Wed Jun 27 13:27:01 2007 From: dmessina at wustl.edu (David Messina) Date: Wed, 27 Jun 2007 12:27:01 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> Message-ID: <99969FC2-479E-408C-AADB-7664EBE937CF@wustl.edu> > [Chris] > We'll also need to start a svn wiki page to show how to get relevant > distros (similar in style probably to the cvs page, with dev > information, how to set up ssh keys, https stuff, etc). I cloned the CVS page and have started adapting it for Subversion: http://www.bioperl.org/wiki/Using_Subversion I'll do some more on it later today, but if anyone wants to fiddle with it in the interim, please do. Dave From n.haigh at sheffield.ac.uk Wed Jun 27 14:44:16 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Wed, 27 Jun 2007 19:44:16 +0100 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <46823ABE.2080300@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> Message-ID: <4682B000.2050707@sheffield.ac.uk> Sendu Bala wrote: > Sendu Bala wrote: >> Sendu Bala wrote: >>> In considering updating all the test scripts to [... use] >>> t/lib/BioperlTest.pm >> I'm now in the process of converting all test scripts. > > And I've now completed that job (for bioperl-live at least), except for > t/EUtilities.t since I know Chris is working on it. > > > In addition to converting to Test::More where necessary, I've also made > all psuedo-TODO blocks real ones. 
Previously I had advised to use SKIP > blocks instead since TODO blocks need a Test::Harness upgrade. However I > think in the next release we ought to make such upgrading compulsory > (which should be automatic when combined with compulsory usage of > Module::Build and Test::More in turn: users simply have to update CPAN). > > > The conversion to BioperlTest directly led to the discovery and fixing > of 6 minor bugs, so was certainly not without merit. > > > No user or developer needs to have BIOPERLDEBUG permanently set to true > anymore. To run all tests you just have to answer yes to the BioDBGFF > and networking questions of 'perl Build.PL'. With './Build test' you > then get clean, easy-to-read output where it is obvious to see that we > currently have these issues: > > t/Sopma.t and t/BioGraphics.t still have fails that I mentioned in > another thread. > > t/protgraph.t, t/blast_pull.t, t/SearchIO.t, t/RestrictionIO.t, > t/RNA_SearchIO.t, t/PopGen.t, t/Genewise.t, t/Assembly.t and > t/Annotation.t all have TODO tests. If you know about those modules, now > would be a great time to implement those TODOs! > > Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are > deprecated' warnings. Ah, that reminds me! I recently tried to do an install of the cvs head (a week or two ago) on a clean installation of Debian 4.0 (etch). During the installation, of dependencies, Bio::ASN1::EntrezGene threw an error as it depends on Bioperl. I seem to remember this circular dependency cropping up before - am I correct - and can you remind me how this was "fixed"? 
Cheers Nath From bix at sendu.me.uk Wed Jun 27 14:52:01 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 27 Jun 2007 19:52:01 +0100 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <4682B000.2050707@sheffield.ac.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> Message-ID: <4682B1D1.3080206@sendu.me.uk> Nathan S. Haigh wrote: > I recently tried to do an install of the cvs head (a week or two ago) on > a clean installation of Debian 4.0 (etch). During the installation, of > dependencies, Bio::ASN1::EntrezGene threw an error as it depends on > Bioperl. I seem to remember this circular dependency cropping up before > - am I correct - and can you remind me how this was "fixed"? Yes, it always happens. It was 'fixed' by being completely ignored by me. Installation is guaranteed to fail, but if you really want it, trying to install again after you already have Bioperl installed will result in success. Clearly something nicer could be done. Suggestions on a postcard... From cjfields at uiuc.edu Wed Jun 27 15:01:01 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 14:01:01 -0500 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <4682B000.2050707@sheffield.ac.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> Message-ID: On Jun 27, 2007, at 1:44 PM, Nathan S. Haigh wrote: > Sendu Bala wrote: >> ... >> Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are >> deprecated' warnings. > > Ah, that reminds me! > > I recently tried to do an install of the cvs head (a week or two > ago) on > a clean installation of Debian 4.0 (etch). During the installation, of > dependencies, Bio::ASN1::EntrezGene threw an error as it depends on > Bioperl. 
I seem to remember this circular dependency cropping up > before > - am I correct - and can you remind me how this was "fixed"? > > Cheers > Nath Wonder if Mingyi Liu would allow Bio::ASN1::EntrezGene to become part of Bioperl (and he could become a dev). That would solve it. chris From n.haigh at sheffield.ac.uk Wed Jun 27 15:16:40 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Wed, 27 Jun 2007 20:16:40 +0100 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> Message-ID: <4682B798.1010409@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Chris Fields wrote: > > On Jun 27, 2007, at 1:44 PM, Nathan S. Haigh wrote: > >> Sendu Bala wrote: >>> ... >>> Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are >>> deprecated' warnings. >> >> Ah, that reminds me! >> >> I recently tried to do an install of the cvs head (a week or two ago) on >> a clean installation of Debian 4.0 (etch). During the installation, of >> dependencies, Bio::ASN1::EntrezGene threw an error as it depends on >> Bioperl. I seem to remember this circular dependency cropping up before >> - am I correct - and can you remind me how this was "fixed"? >> >> Cheers >> Nath > > Wonder if Mingyi Liu would allow Bio::ASN1::EntrezGene to become part of > Bioperl (and he could become a dev). That would solve it. > > chris Just to put the feelers out to see what people think. It seems (to me at least) that Bioperl modules could/should? be released as individual modules and that "bioperl" would really constitute a "bundle" of all these modules - in terms of CPAN anyway. Am I correct in this thinking? The Bio::ASN1::EntrezGene could simply require a particular module rather than the whole of bioperl - might get out of the circular dependency theoretically!?
I'm not suggesting moving in this direction, but just wondered what others thought about this concept? Nath -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGgreYczuW2jkwy2gRAi5IAJ9/Alq1fktEmAF16DlKcBVcy7d+jQCeIj+X tOFQUQ7cGJLUITEDw1+QLxc= =Yc+g -----END PGP SIGNATURE----- From cjfields at uiuc.edu Wed Jun 27 15:31:44 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 14:31:44 -0500 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <4682B798.1010409@sheffield.ac.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> Message-ID: <33C76559-4771-4FDC-9EEA-1645BC3C576C@uiuc.edu> On Jun 27, 2007, at 2:16 PM, Nathan S. Haigh wrote: > ... > > Just to put the feelers out to see what people think. > > It seems (to me at least) that Bioperl modules could/should? be > released > as individual modules and that "bioperl" would really constitute a > "bundle" of all these modules - in terms of CPAN anyway. Am I > correct in > this thinking? The Bio::ASN1::EntrezGene could simply require a > particular module rather than the whole of bioperl - might get out of > the circular dependency theoretically!? > > I'm not suggesting moving in this direction, but just wondered what > others thought about this concept? > > Nath Well, Steve suggested splitting some of core into distinct groups, which I tend to agree with in some respects (speed up releases for those modules, such as SearchIO, DB, Graphics). The problem we have yet to solve is what we consider 'core'. Is it Bio::Seq and related? Should it include Bio::DB*? Should it just be Bio::* modules with no or very few external dependencies? And so on..., probably not a decision we want to make immediately (until after svn migration, tests finished, maybe a release or two, a beer)... 
The Bioperl module dependency that Bio::ASN1::EntrezGene has is Bio::Index::AbstractSeq. You could try a test build of Bio::ASN1::EntrezGene to see what happens. chris From hlapp at gmx.net Wed Jun 27 15:49:15 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 27 Jun 2007 16:49:15 -0300 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> Message-ID: On Jun 27, 2007, at 1:27 PM, David Messina wrote: > I would think we would want "Author Date Id Rev URL" set on > everything, no?. So either cvs2svn or your tool (whichever you think > is better), followed by > > svn propset svn:keywords "Author Date Id Rev URL" * Shouldn't this be done recursively? -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Wed Jun 27 15:50:27 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 27 Jun 2007 16:50:27 -0300 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> Message-ID: On Jun 27, 2007, at 12:18 PM, Chris Fields wrote: > Most projects make a clean break with cvs (no more commits) for the > reasons you point out. Not sure how the other core devs feel about > that but I could go for that; it would def. prevent headaches. There shouldn't be any cvs write support after the cut-over I think. 
I don't see the benefit that would justify the huge headache potential. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Wed Jun 27 16:01:40 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 15:01:40 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> Message-ID: <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> On Jun 27, 2007, at 2:50 PM, Hilmar Lapp wrote: > > On Jun 27, 2007, at 12:18 PM, Chris Fields wrote: > >> Most projects make a clean break with cvs (no more commits) for the >> reasons you point out. Not sure how the other core devs feel about >> that but I could go for that; it would def. prevent headaches. > > There shouldn't be any cvs write support after the cut-over I > think. I don't see the benefit that would justify the huge headache > potential. > > -hilmar Agreed, so maybe we should set that in stone. That means no svn->cvs syncing post-migration as well, I assume. Now how about a quick straw poll, what kind of access? svn+ssh is already available, but some (Aaron among them) have indicated they would like https as well (not sure how involved it would be to set up). 
chris From hlapp at gmx.net Wed Jun 27 16:08:40 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 27 Jun 2007 17:08:40 -0300 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> Message-ID: <4FDAE951-7CFF-40B1-9CE3-9BCEAD37E58F@gmx.net> On Jun 27, 2007, at 5:01 PM, Chris Fields wrote: > That means no svn->cvs syncing post-migration as well, I assume. That's a bit of a different story. People out there have URL links into our anonymous CVS repository. If it's not too troublesome (and I tend to think it's not) I'd like to maintain those in working order, either b/c there is still a sync'ed r/o cvs, or b/c of a cgi script that maps between the URL flavors (i.e., that maps a CVS-style URL to the equivalent SVN link). -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hartzell at alerce.com Wed Jun 27 16:15:10 2007 From: hartzell at alerce.com (George Hartzell) Date: Wed, 27 Jun 2007 16:15:10 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> Message-ID: <18050.50510.84363.355034@almost.alerce.com> David Messina writes: > > [Chris] > > > > I managed to get it working using file://.
Haven't tried svn+ssh yet > > but I've had persistent problems getting ssh to work properly on my > > macbook; not sure why yet but I haven't had time to play around > > with it. > > I just did a checkout and a test commit, both via svn+ssh -- works > great for me. Is there anyone working outside of bioperl-{run,live,ext}? g. From bix at sendu.me.uk Wed Jun 27 16:22:13 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 27 Jun 2007 21:22:13 +0100 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <4682B798.1010409@sheffield.ac.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> Message-ID: <4682C6F5.4020406@sendu.me.uk> Nathan S. Haigh wrote: > It seems (to me at least) that Bioperl modules could/should? be released > as individual modules and that "bioperl" would really constitute a > "bundle" of all these modules - in terms of CPAN anyway. Am I correct in > this thinking? The Bio::ASN1::EntrezGene could simply require a > particular module rather than the whole of bioperl - might get out of > the circular dependency theoretically!? No, it wouldn't. The 'problem' only arises because the user is /choosing/ to install both Bioperl and Bio::ASN1::EntrezGene at the same time. So even if Bioperl was released as separate modules there would still be that 'bundle' and users would still choose to do the same thing: install all the Bioperl modules as well as all its /optional/ recommended modules. And there lies the problem: Bio::ASN1::EntrezGene requires Bioperl modules, and one Bioperl module requires Bio::ASN1::EntrezGene, so the circularity isn't solved. (FYI: Bio::ASN1::EntrezGene requires Bio::Index::AbstractSeq Bio::Index::AbstractSeq requires a couple of Bioperl modules, including Bio::Root::Root Bio::SeqIO::entrezgene requires Bio::ASN1::EntrezGene and a bunch of Bioperl modules, including Bio::Root::Root. 
) You only avoid circularity by choosing not to install everything in one go. Which is something you can do right now with no problems. From n.haigh at sheffield.ac.uk Wed Jun 27 16:24:18 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Wed, 27 Jun 2007 21:24:18 +0100 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> Message-ID: <4682C772.5070502@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hilmar Lapp wrote: > On Jun 27, 2007, at 12:18 PM, Chris Fields wrote: > >> Most projects make a clean break with cvs (no more commits) for the >> reasons you point out. Not sure how the other core devs feel about >> that but I could go for that; it would def. prevent headaches. > > There shouldn't be any cvs write support after the cut-over I think. > I don't see the benefit that would justify the huge headache potential. > > -hilmar I agree. A clean switch from cvs read/write to svn read/write plus cvs read only sounds the least problematic! However, how will links to cvs be dealt with? Links on Bioperl could be switched over to point to svn, but what about possible links from external sources? Maybe a more generic approach of redirection could work? Or a simple warning page stating the fact that we have moved from cvs to svn and provide a common link to follow? 
Nath -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGgsdyczuW2jkwy2gRAtuyAKDIpN0TNX0U7sTuE3i+fj6WFZ1K0QCfcX7Y 81KurFwJlRtYFxSmLZP56Sk= =pp7b -----END PGP SIGNATURE----- From hlapp at gmx.net Wed Jun 27 16:30:19 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 27 Jun 2007 17:30:19 -0300 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <18049.30026.61328.134490@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> Message-ID: <834150E7-9307-4B02-B5F0-6FB01393435D@gmx.net> On Jun 26, 2007, at 5:21 PM, George Hartzell wrote: > > svn+ssh://dev.open-bio.org/home/hartzell/bioperl > Cool - this works for me. One thing I notice is that in cvs log you see which version is in which branch which is useful to answer user queries that might be a version problem. svn log doesn't seem to want to show that. Does anyone have ideas for how to do this in svn? -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Wed Jun 27 16:32:18 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 27 Jun 2007 17:32:18 -0300 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <4682C772.5070502@sheffield.ac.uk> References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <4682C772.5070502@sheffield.ac.uk> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Jun 27, 2007, at 5:24 PM, Nathan S. Haigh wrote: > However, how will links to cvs be dealt with? 
Well I said before that probably one can write a couple of lines of Perl to write a cgi script that returns the appropriate redirect URL with a redirect status code. -hilmar - -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (Darwin) iD8DBQFGgslWuV6N2JxL7qsRAvsTAKDjR18NzWzlj74mCF+diNpe2dLV2ACgn/4Y f6sJ/ngeKEGpKHgyAHM1DAA= =8n0E -----END PGP SIGNATURE----- From cjfields at uiuc.edu Wed Jun 27 16:50:11 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 15:50:11 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <834150E7-9307-4B02-B5F0-6FB01393435D@gmx.net> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <834150E7-9307-4B02-B5F0-6FB01393435D@gmx.net> Message-ID: <9DD44BC0-95D5-43E1-83F3-E3ACC6825563@uiuc.edu> On Jun 27, 2007, at 3:30 PM, Hilmar Lapp wrote: > > On Jun 26, 2007, at 5:21 PM, George Hartzell wrote: > >> >> svn+ssh://dev.open-bio.org/home/hartzell/bioperl >> > > Cool - this works for me. > > One thing I notice is that in cvs log you see which version is in > which branch which is useful to answer user queries that might be a > version problem. svn log doesn't seem to want to show that. Does > anyone have ideas for how to do this in svn? > > -hilmar We prob. should move it to a new directory ASAP which george can write to when he needs to update. cvs is in /home/repository/ bioperl, so maybe something similar, like /home/svn/repository/bioperl? 
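For reference, the CGI redirect idea Hilmar describes above could be sketched in a few lines of Perl along these lines. Everything specific here is an assumption for illustration: the host svn.example.org is a placeholder, the module/trunk layout of the new repository is a guess, and the function names cvs_path_to_svn_url and cgi_response are made up. The only standard piece relied on is the CGI convention of a "Status:" header followed by "Location:".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Placeholder base URL for a Subversion web view (hypothetical host).
my $SVN_BASE = 'http://svn.example.org/bioperl';

# Map a CVS-style path (/module/dir/file) onto an assumed SVN layout
# (/module/trunk/dir/file). Returns undef for paths we do not recognise.
sub cvs_path_to_svn_url {
    my ($path) = @_;
    return unless defined $path;
    return if $path =~ /\.\./;    # refuse parent-directory tricks
    my ($module, $rest) = $path =~ m{^/(bioperl-[\w-]+)(/.*)?$} or return;
    $rest = '' unless defined $rest;
    return "$SVN_BASE/$module/trunk$rest";
}

# Build the full CGI response: a permanent redirect when the path maps,
# otherwise a plain-text 404.
sub cgi_response {
    my ($path) = @_;
    my $url = cvs_path_to_svn_url($path);
    return "Status: 301 Moved Permanently\r\nLocation: $url\r\n\r\n"
        if defined $url;
    return "Status: 404 Not Found\r\nContent-Type: text/plain\r\n\r\n"
         . "No Subversion equivalent known for this path.\n";
}

# When run as a CGI script, the web server supplies PATH_INFO.
print cgi_response($ENV{PATH_INFO}) unless caller;
1;
```

A real mapping would also have to deal with CVS revision and branch query parameters, which this sketch ignores.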
chris From cjfields at uiuc.edu Wed Jun 27 16:51:37 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 15:51:37 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <4FDAE951-7CFF-40B1-9CE3-9BCEAD37E58F@gmx.net> References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> <4FDAE951-7CFF-40B1-9CE3-9BCEAD37E58F@gmx.net> Message-ID: <4D8CAAD9-4774-47FB-84E0-7FBA50EC377B@uiuc.edu> On Jun 27, 2007, at 3:08 PM, Hilmar Lapp wrote: > > On Jun 27, 2007, at 5:01 PM, Chris Fields wrote: > >> That means no svn->cvs syncing post-migration as well, I assume. > > That's a bit of a different story. People out there have URL links > into our anonymous CVS repository. If it's not too troublesome (and > I tend to think it's not) I'd like to maintain those in working > order, either b/c there is still a sync'ed r/o cvs, or b/c of a cgi > script that maps between the URL flavors (i.e., that maps a CVS-style > URL to the equivalent SVN link). > > -hilmar I'll try getting a wiki page up as a checklist for this, including what direction we're heading in, ideas (your list and CGI redirect ideas, svn:ignore issues, etc). Dave has already started on the 'getting bioperl using svn' wiki page. If we intend to sync cvs with svn we need to find the right tools or at least check for other projects which have done something similar. I haven't googled on that yet but I'll attempt to tonight. chris From cjfields at uiuc.edu Wed Jun 27 16:53:08 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 15:53:08 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: Message-ID: <4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu> bioperl-run also. I think the run CVS repo has some binary files, so if there are any problems with cvs2svn it'll be there.
chris On Jun 27, 2007, at 3:19 PM, Brian Osborne wrote: > George, > > bioperl-db and bioperl-network should be included, I think. > > Brian O > > > On 6/27/07 4:15 PM, "George Hartzell" wrote: > >> David Messina writes: >>>> [Chris] >>>> >>>> I managed to get it working using file://. Haven't tried svn >>>> +ssh yet >>>> but I've had persistent problems getting ssh to work properly on my >>>> macbook; not sure why yet but I haven't had time to play around >>>> with it. >>> >>> I just did a checkout and a test commit, both via svn+ssh -- works >>> great for me. >> >> Is there anyone working outside of bioperl-{run,live,ext}? >> >> g. >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Wed Jun 27 17:05:50 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 27 Jun 2007 22:05:50 +0100 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <4682C6F5.4020406@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> Message-ID: <4682D12E.3000803@sendu.me.uk> Sendu Bala wrote: > Nathan S. Haigh wrote: >> It seems (to me at least) that Bioperl modules could/should? be released >> as individual modules and that "bioperl" would really constitute a >> "bundle" of all these modules - in terms of CPAN anyway. Am I correct in >> this thinking? The Bio::ASN1::EntrezGene could simply require a >> particular module rather than the whole of bioperl - might get out of >> the circular dependency theoretically!? > > No, it wouldn't. [snip] > You only avoid circularity by choosing not to install everything in one > go. Errr... I take that back. 
Since CPAN bundles install things in a certain order, you just have to make sure that everything Bio::ASN1::EntrezGene needs is installed first, then Bio::ASN1::EntrezGene, then Bio::SeqIO::entrezgene. But the main problem with this approach is that maintenance, global-style code improvements and releases become a nightmare. I could, perhaps, imagine a scenario where the repository stayed as-is (one monolithic collection), but the dist action of Build.PL could be altered to generate a release package per module instead of one big release package of all modules, as is currently the case. Is there much value in doing that? Does anyone want me to look into the feasibility of such a thing? From bosborne11 at verizon.net Wed Jun 27 16:19:47 2007 From: bosborne11 at verizon.net (Brian Osborne) Date: Wed, 27 Jun 2007 16:19:47 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <18050.50510.84363.355034@almost.alerce.com> Message-ID: George, bioperl-db and bioperl-network should be included, I think. Brian O On 6/27/07 4:15 PM, "George Hartzell" wrote: > David Messina writes: >>> [Chris] >>> >>> I managed to get it working using file://. Haven't tried svn+ssh yet >>> but I've had persistent problems getting ssh to work properly on my >>> macbook; not sure why yet but I haven't had time to play around >>> with it. >> >> I just did a checkout and a test commit, both via svn+ssh -- works >> great for me. > > Is there anyone working outside of bioperl-{run,live,ext}? > > g. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From n.haigh at sheffield.ac.uk Wed Jun 27 17:25:53 2007 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Wed, 27 Jun 2007 22:25:53 +0100 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <4682D12E.3000803@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> Message-ID: <4682D5E1.2030507@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Sendu Bala wrote: > Sendu Bala wrote: >> Nathan S. Haigh wrote: >>> It seems (to me at least) that Bioperl modules could/should? be released >>> as individual modules and that "bioperl" would really constitute a >>> "bundle" of all these modules - in terms of CPAN anyway. Am I correct in >>> this thinking? The Bio::ASN1::EntrezGene could simply require a >>> particular module rather than the whole of bioperl - might get out of >>> the circular dependency theoretically!? >> >> No, it wouldn't. > [snip] >> You only avoid circularity by choosing not to install everything in >> one go. > > Errr... I take that back. Since CPAN bundles install things in a certain > order, you just have to make sure that everything Bio::ASN1::EntrezGene > needs is installed first, then Bio::ASN1::EntrezGene, then > Bio::SeqIO::entrezgene. > > But the main problem with this approach is that maintenance, > global-style code improvements and releases become a nightmare. I could, > perhaps, imagine a scenario where the repository stayed as-is (one > monolithic collection), but the dist action of Build.PL could be altered > to generate a release package per module instead of one big release > package of all modules, as is currently the case. > > Is there much value in doing that? Does anyone want me to look into the > feasibility of such a thing? 
I think the value would be in other external modules being able to use bioperl modules with more ease (not sure how many modules have depended, or currently depend, on bioperl) as they would depend on a single module, rather than the whole package. However, how would the dependencies of each module be handled? I'm clearly thinking aloud, but... maybe this would tease apart "cliques" of interdependent modules, which could themselves be shipped as bundles (e.g. Bio::Graphics), with a "master" bioperl bundle that installs all the bioperl modules. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGgtXhczuW2jkwy2gRAiftAKDZQGDpaq5saEyE3ZfPyFqli4j+8QCfXbIB 2EZjccEFEzfFlx4H47gzwLk= =nobl -----END PGP SIGNATURE----- From hlapp at gmx.net Wed Jun 27 17:35:28 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 27 Jun 2007 18:35:28 -0300 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu> References: <4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu> Message-ID: <9FEF16FD-47F6-4D7B-A1E6-1A92CC82570C@gmx.net> Is there a reason not to port every subproject over? -hilmar On Jun 27, 2007, at 5:53 PM, Chris Fields wrote: > bioperl-run also. I think the run CVS repo has some binary files, so > if there are any problems with cvs2svn it'll be there. > > chris > > On Jun 27, 2007, at 3:19 PM, Brian Osborne wrote: > >> George, >> >> bioperl-db and bioperl-network should be included, I think. >> >> Brian O >> >> >> On 6/27/07 4:15 PM, "George Hartzell" wrote: >> >>> David Messina writes: >>>>> [Chris] >>>>> >>>>> I managed to get it working using file://. Haven't tried svn >>>>> +ssh yet >>>>> but I've had persistent problems getting ssh to work properly >>>>> on my >>>>> macbook; not sure why yet but I haven't had time to play around >>>>> with it.
>>>> >>>> I just did a checkout and a test commit, both via svn+ssh -- works >>>> great for me. >>> >>> Is there anyone working outside of bioperl-{run,live,ext}? >>> >>> g. >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Wed Jun 27 17:36:29 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 16:36:29 -0500 Subject: [Bioperl-l] Splits again, formerly Test overhaul complete In-Reply-To: <4682D12E.3000803@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> Message-ID: <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> On Jun 27, 2007, at 4:05 PM, Sendu Bala wrote: > Sendu Bala wrote: >> Nathan S. Haigh wrote: >>> It seems (to me at least) that Bioperl modules could/should? be >>> released >>> as individual modules and that "bioperl" would really constitute a >>> "bundle" of all these modules - in terms of CPAN anyway. Am I >>> correct in >>> this thinking? The Bio::ASN1::EntrezGene could simply require a >>> particular module rather than the whole of bioperl - might get >>> out of >>> the circular dependency theoretically!? >> No, it wouldn't. 
> [snip] >> You only avoid circularity by choosing not to install everything >> in one go. > > Errr... I take that back. Since CPAN bundles install things in a > certain order, you just have to make sure that everything > Bio::ASN1::EntrezGene needs is installed first, then > Bio::ASN1::EntrezGene, then Bio::SeqIO::entrezgene. > > But the main problem with this approach is that maintenance, > global-style code improvements and releases become a nightmare. I could, > perhaps, imagine a scenario where the repository stayed as-is (one > monolithic collection), but the dist action of Build.PL could be > altered to generate a release package per module instead of one big > release package of all modules, as is currently the case. > > Is there much value in doing that? Does anyone want me to look into > the feasibility of such a thing? Not for the time being, at least in my opinion. Too much on our plate at this point with svn migration, test conversion, bugzilla running over (next point of attack!), etc. Maybe something to think about after, though I like the idea of a few splits to core as Steve suggested (SearchIO, Graphics, some LWP-related DB modules). My (albeit extreme) thought is to have a lean-and-mean set of 'core' modules with as few external dependencies as possible, which could work around the circular dependency issue in this case:

                   dep.on                    dep.on
  Bio::Auxiliary   ----->  ASN1::EntrezGene  ----->  core
  (with EntrezGene)                                  (basic SeqIO, Index, DB, etc)
         \---->------>--- dep.on ->----->----->----/

Bioperl auxiliary modules would list core as a required dependency along with anything else needed for that particular aux. section (e.g. XML parsers, LWP, GD). The whole mess, if needed, would be installed using Bundle::BioPerl or similar, with no part released w/o testing on the whole 'base' to ensure proper interaction. If a fix needed to be made in one set, make the fix, test against bioperl 'base' as a whole, and release when possible. 
No need to wait for a full-fledged 1.5.3 release. Maybe wishful thinking... chris From cjfields at uiuc.edu Wed Jun 27 17:44:47 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 16:44:47 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <9FEF16FD-47F6-4D7B-A1E6-1A92CC82570C@gmx.net> References: <4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu> <9FEF16FD-47F6-4D7B-A1E6-1A92CC82570C@gmx.net> Message-ID: <1CB82DD7-A997-47E8-A6A5-2AC9B375E875@uiuc.edu> We should port them all, yes. chris On Jun 27, 2007, at 4:35 PM, Hilmar Lapp wrote: > Is there a reason not to port every subproject over? > > -hilmar From cjfields at uiuc.edu Wed Jun 27 17:53:02 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 16:53:02 -0500 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <4682D5E1.2030507@sheffield.ac.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <4682D5E1.2030507@sheffield.ac.uk> Message-ID: <1DF8175A-5212-40CE-A2C5-B9C34E057D00@uiuc.edu> On Jun 27, 2007, at 4:25 PM, Nathan S. Haigh wrote: >> ... >> Is there much value in doing that? Does anyone want me to look >> into the >> feasibility of such a thing? > > > I think the value would be in other external modules being able to use > bioperl modules with more ease (not sure how many modules have, or > currently depend on bioperl) as they would depend on a single module, > rather than the whole package. However, how would the dependencies of > each module be handled? I'm clearly thinking aloud, but....Maybe this > would tease apart "cliques" of modules that are interdependent? and > could in themselves be shipped as bundles e.g. Bio::Graphics and > have a > "master" bioperl bundle that installa all the bioperl modules. 
See my response to Sendu, and Steve Chervitz's original post and related thread: http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15315/ focus=15315 which pretty much covers the same ground. I think at most 4-5 split 'cliques', including core, with the fewest possible dependencies in core. If we do any of this, it prob. should wait until after an svn migration and bugzilla bug stomping unless there is a (well-argued) advantage to doing it now. chris From n.haigh at sheffield.ac.uk Wed Jun 27 18:07:31 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Wed, 27 Jun 2007 23:07:31 +0100 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <1DF8175A-5212-40CE-A2C5-B9C34E057D00@uiuc.edu> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <4682D5E1.2030507@sheffield.ac.uk> <1DF8175A-5212-40CE-A2C5-B9C34E057D00@uiuc.edu> Message-ID: <4682DFA3.9090100@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Chris Fields wrote: > > On Jun 27, 2007, at 4:25 PM, Nathan S. Haigh wrote: > >>> ... >>> Is there much value in doing that? Does anyone want me to look into the >>> feasibility of such a thing? >> >> >> I think the value would be in other external modules being able to use >> bioperl modules with more ease (not sure how many modules have, or >> currently depend on bioperl) as they would depend on a single module, >> rather than the whole package. However, how would the dependencies of >> each module be handled? I'm clearly thinking aloud, but....Maybe this >> would tease apart "cliques" of modules that are interdependent? and >> could in themselves be shipped as bundles e.g. Bio::Graphics and have a >> "master" bioperl bundle that installa all the bioperl modules. 
> > See my response to Sendu, and Steve Chervitz's original post and related > thread: > > http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15315/focus=15315 > > which pretty much covers the same ground. I think at most 4-5 split > 'cliques', including core, with the fewest possible dependencies in > core. If we do any of this, it prob. should wait until after an svn > migration and bugzilla bug stomping unless there is a (well-argued) > advantage to doing it now. > > chris That's fine by me - or should I say, the best way forward - I was really just thinking aloud :) Nath -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGgt+jczuW2jkwy2gRAhPmAKDCgI1BOp/MOQVUQhQGqWaRRfPTaACfTPix TSi/e8PtYTwpxn6x+ewrjBs= =7Vp1 -----END PGP SIGNATURE----- From bix at sendu.me.uk Wed Jun 27 18:43:48 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 27 Jun 2007 23:43:48 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> Message-ID: <4682E824.1050507@sendu.me.uk> Chris Fields wrote: > On Jun 27, 2007, at 4:05 PM, Sendu Bala wrote: >> But the main problem with this approach is that maintenance, global- >> style code improvements and releases become a nightmare. I could, >> perhaps, imagine a scenario where the repository stayed as-is (one >> monolithic collection), but the dist action of Build.PL could be >> altered to generate a release package per module instead of one big >> release package of all modules, as is currently the case. >> >> Is there much value in doing that? Does anyone want me to look into >> the feasibility of such a thing? 
> > Not for the time being, at least in my opinion. Too much on our > plate at this point with svn migration, test conversion, bugzilla > running over (next point of attack!), etc. Maybe something to think > about after, though I like the idea of a few splits to core as Steve > suggested (SearchIO, Graphics, some LWP-related DB modules). [snip] > If a fix needed to be made in one set, make the fix, test against > bioperl 'base' as a whole, and release when possible. No need to > wait for a full-fledged 1.5.3 release. What advantage is there of these defined splits instead of individual modules? As I see it you lose some of the potential benefits of breaking Bioperl up completely, whilst also suffering the maintenance problems I outlined in my objection to Steve's post. Being able to work on all Bioperl from a single cvs (ne svn) check out/ archive, whilst distributing it as individual modules on CPAN seems like the best of both worlds to me. What am I missing? From hartzell at alerce.com Wed Jun 27 20:41:01 2007 From: hartzell at alerce.com (George Hartzell) Date: Wed, 27 Jun 2007 20:41:01 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <9DD44BC0-95D5-43E1-83F3-E3ACC6825563@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <834150E7-9307-4B02-B5F0-6FB01393435D@gmx.net> <9DD44BC0-95D5-43E1-83F3-E3ACC6825563@uiuc.edu> Message-ID: <18051.925.23313.932916@almost.alerce.com> Chris Fields writes: > [...] > We prob. should move it to a new directory ASAP which george can > write to when he needs to update. cvs is in /home/repository/ > bioperl, so maybe something similar, like /home/svn/repository/bioperl? I'd be parsimonious (lazy...) and go for /home/svn/bioperl. g. 
From hartzell at alerce.com Wed Jun 27 20:46:29 2007 From: hartzell at alerce.com (George Hartzell) Date: Wed, 27 Jun 2007 20:46:29 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> Message-ID: <18051.1253.87485.235496@almost.alerce.com> Chris Fields writes: > [...] > Now how about a quick straw poll, what kind of access? svn+ssh is > already available, but some (Aaron among them) have indicated they > would like https as well (not sure how involved it would be to set up). What we do here, in large part, depends on what our host machine makes available to us. Is there an apache instance that we can use? Maybe a separate one? May someone among us configure it, or do we need to ask for help? (in other words, does anyone have sudo?) Is there some reason to not include http: (using Digest authentication so that passwords aren't passed in the clear?)? Maybe even go so far as to ask why bother with https:, it's not like we need to transfer any data encrypted.... g. From dmessina at wustl.edu Wed Jun 27 23:02:25 2007 From: dmessina at wustl.edu (David Messina) Date: Wed, 27 Jun 2007 22:02:25 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> Message-ID: On Jun 27, 2007, at 2:49 PM, Hilmar Lapp wrote: > > On Jun 27, 2007, at 1:27 PM, David Messina wrote: > >> I would think we would want "Author Date Id Rev URL" set on >> everything, no?. 
So either cvs2svn or your tool (whichever you think >> is better), followed by >> >> svn propset svn:keywords "Author Date Id Rev URL" * > > Shouldn't this be done recursively? Yep, good catch! Thanks, Hilmar. Should be: svn propset --recursive svn:keywords "Author Date Id Rev URL" * From jason at bioperl.org Wed Jun 27 23:29:09 2007 From: jason at bioperl.org (Jason Stajich) Date: Thu, 28 Jun 2007 00:29:09 -0300 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <18051.1253.87485.235496@almost.alerce.com> References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> <18051.1253.87485.235496@almost.alerce.com> Message-ID: I think Chris D and I will need to confer a bit on https+svn. I don't know when we'll have a good chance to discuss everything. At some point this discussion may need to be taken off bioperl-l and continued among just the interested parties, as we're delving into hardware geek land. The repository machine (dev) is a locked-down machine, meaning it only really runs ssh and few other servers; in particular it does not run httpd. We have anonymous CVS (client and through httpd browsing) running on a separate machine (code) that has the info rsynced over every 10 or 15 minutes. The foundation websites and mailing lists run on a third machine (portal). If we decide to support https we'll need to spend a little time deciding how well we can keep it locked down - it will only be https, not http, for example, and we may want to see about limiting ssh access for everyone if we migrate all OBF projects over to SVN and only support https. Again, to reiterate what I think we would do: - SVN read/write will live on 'dev'; _WHEN_ we switch over there will be no more writes to the CVS repository. 
It will be available by ssh+svn and potentially by https+svn - SVN read-only will live on 'code', it will be accessible by http+svn - CVS read-only will live on 'code', this will only be a sync from the SVN to the CVS. See http://svn2cvs.tigris.org/ for details As I tried to ask for in the past, would someone also illustrate the importance of why _WE_ need to switch to SVN on a wiki page on Bioperl so that when someone complains/asks about this in the future the arguments are already laid out. I am basically fine with it, but I don't honestly see a compelling reason beyond what has been mentioned wrt better integration in IDEs. http://bioperl.org/wiki/Why_SVN -jason On Jun 27, 2007, at 9:46 PM, George Hartzell wrote: > Chris Fields writes: >> [...] >> Now how about a quick straw poll, what kind of access? svn+ssh is >> already available, but some (Aaron among them) have indicated they >> would like https as well (not sure how involved it would be to set >> up). > > What we do here, in large part, depends on what our host machine makes > available to us. > > Is there an apache instance that we can use? Maybe a separate one? > > May someone among us configure it, or do we need to ask for help? (in > other words, does anyone have sudo?) > > Is there some reason to not include http: (using Digest authentication > so that passwords aren't passed in the clear?)? Maybe even go so far > as to ask why bother with https:, it's not like we need to transfer > any data encrypted.... > > g. 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From jason at bioperl.org Wed Jun 27 23:51:32 2007 From: jason at bioperl.org (Jason Stajich) Date: Thu, 28 Jun 2007 00:51:32 -0300 Subject: [Bioperl-l] Splits again In-Reply-To: <4682E824.1050507@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> Message-ID: <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org> Hey guys - I'm wading in a bit late as I haven't had time to keep up with the whole discussion. So you are suggesting 800+ individual CPAN modules? I don't think that is a good idea. Why would you split up Bio::Seq::RichSeq and Bio::Seq into two separate packages, for example? I think if you really want to move away from the monolithic install it has to be more logical by function - but I am not that optimistic that this is going to actually be easier for people. Maybe I'm misunderstanding. What are the arguments for separating things -- to make it so people aren't scared by the number of modules so they'll code? It seems like some people just want it to be installed so they can run scripts - does having them install dozens of modules work? Do we need to consider how much this would suck for someone who can't use CPAN or Module::Build to automate dependency tracking and installation? How does it work when modules are deprecated? I'm not sure I have made up my mind on what I'd like to see, but at some point I think we need to get a clearer idea of what audience we are trying to serve best. 
If we want it to be easy to install, maybe we should invest time into making OSX double-click installers, RPMs, and the Windows stuff easily installable. Or do we want to serve the developers who aren't using SVN, so that we want to push out releases of modules ASAP? I just am not clear on the motivation for some of the proposed changes. Also - the main point I wanted to make - can I suggest we spend a little time discussing what it will take to get a stable release for the current code as it stands (bioperl-live and bioperl-run)? It seems like we really need to do this first so that we have a stable release that can be followed by the CVS -> SVN migration, and then consider major changes to the repository structure and release packaging, and potential deprecation and incorporation of other modules. I assume there is no chance that we'd have a 1.6 candidate by BOSC next month? Will it be productive to schedule a fair amount of time at BOSC discussing how to partition out the packages into separate sub-packages after we've done a successful release, rather than trying to change things right now? I realize not everyone will be there but maybe it will be easier to interact on this then. I think it will also be time to talk with Lincoln/Scott about how Gbrowse is structured and if that is working for them. There is so much code in different places that I think we need to figure out how to structure it properly so those packages can be released. It would probably mean moving Bio::Graphics, Bio::DB::GFF and Bio::DB::SeqFeature and the gff tools for Gbrowse into separate packages so they could be released more regularly, on par with Gbrowse schedules. Also I think someone needs to figure out Bio::Tools::GFF vs Bio::FeatureIO -- what do we want to do? I don't think we really fully support GFF3 that well -- the X2GFF scripts probably need some more good testing (where X is BLAST, FASTA, Sim4, GenBank, EMBL, etc.) and/or migration to the proper GFF writing. 
-jason On Jun 27, 2007, at 7:43 PM, Sendu Bala wrote: > Chris Fields wrote: >> On Jun 27, 2007, at 4:05 PM, Sendu Bala wrote: >>> But the main problem with this approach is that maintenance, global- >>> style code improvements and releases become a nightmare. I could, >>> perhaps, imagine a scenario where the repository stayed as-is (one >>> monolithic collection), but the dist action of Build.PL could be >>> altered to generate a release package per module instead of one big >>> release package of all modules, as is currently the case. >>> >>> Is there much value in doing that? Does anyone want me to look into >>> the feasibility of such a thing? >> >> Not for the time being, at least in my opinion. Too much on our >> plate at this point with svn migration, test conversion, bugzilla >> running over (next point of attack!), etc. Maybe something to think >> about after, though I like the idea of a few splits to core as Steve >> suggested (SearchIO, Graphics, some LWP-related DB modules). > [snip] >> If a fix needed to be made in one set, make the fix, test against >> bioperl 'base' as a whole, and release when possible. No need to >> wait for a full-fledged 1.5.3 release. > > What advantage is there of these defined splits instead of individual > modules? As I see it you lose some of the potential benefits of > breaking > Bioperl up completely, whilst also suffering the maintenance > problems I > outlined in my objection to Steve's post. > > Being able to work on all Bioperl from a single cvs (ne svn) check > out/ > archive, whilst distributing it as individual modules on CPAN seems > like > the best of both worlds to me. What am I missing? 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From chris at bioteam.net Thu Jun 28 00:08:25 2007 From: chris at bioteam.net (Chris Dagdigian) Date: Thu, 28 Jun 2007 00:08:25 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> <18051.1253.87485.235496@almost.alerce.com> Message-ID: <97A3257B-8E00-48D7-8B7D-51AD728CB8F7@bioteam.net> My understanding of "https+svn" is that it is actually WebDAV-over- HTTP which means that not only would we need to light up a HTTPD server on the developer box we'd also have to get a stable mod_dav module installed (sometimes not trivial) and then we would have to figure out how to handle the authentication bits. Right now with SSH we use Unix group permissions to figure out who can write to what repository -- WebDAV makes this a lot more complicated. Forcing encryption over https will prevent someone from sniffing a developer password which removes the main security issue. The next problem is going to be integrating the DAV module with Linux PAM so that existing usernames and passwords can be used, -OR- we have to set up and maintain an entirely separate set of username and password maps for each developer and each SVN project. I'm not super concerned about this -- BioTeam runs svn internally and we expose our SVN for employees both via WebDAV and SVN+SSH - it's not that hard to set up. My biggest concern really has to do with how much extra work this will mean for the OBF sysadmin team. If there is an easy way to get a stable Apache/DAV/SVN integration going with authentication coming from Linux PAM then this is no big deal. 
If we have to manually maintain separate authentication lists then it will be kind of a hassle. Like Jason mentioned, the OBF currently segregates "stuff" onto three different servers with three levels of security: - dev.open-bio.org -- Developers only, SSH access only (main sourcecode repository for OBF) - portal.open-bio.org -- Websites, Wikis, Blogs, Mailing list servers and helpdesk.open-bio.org - code.open-bio.org -- "Disposable" anonymous access server that we can easily burn/wipe/reinstall if it ever gets hacked Everything else that Jason mentioned is fine and easy to set up (if not already running): - SVN+SSH for developers - Anonymous SVN and Anonymous RSYNC for community access on code.open-bio.org - svn2cvs for whomever wants it on code.open-bio.org - web based SVN code browser installed on http://code.open-bio.org Regards, Chris On Jun 27, 2007, at 11:29 PM, Jason Stajich wrote: > I think Chris D and I will need to confer a bit on https+svn. I > don't know when we'll have a good chance to discuss everything. At > some point this discussion is may need to be taken off bioperl and > just the interested parties as we're delving into hardware geek land. > > The repository machine (dev) is a locked down machine meaning it > only really runs ssh and not many servers include httpd. We have > anonymous CVS (client and through httpd browsing) running on a > separate machine (code) that has the info rsynced over every 10 or > 15 minutes. The foundation websites and mailing lists run on a > third machine (portal). > > > If we decide to support https we'll need to spend a little time > deciding how well we can keep it locked down - it will only be > https not http for example and we may want to see about limiting > ssh access to everyone if we migrate all OBF projects over to SVN > and only support https. > > Again to re-iterate what I think we would do: > - SVN read/write will live on 'dev', _WHEN_ we switch over no > writes to the CVS repository. 
It will be available by ssh+svn and > potentially by https+svn > - SVN read-only will live on 'code', it will be accessible by http > +svn > - CVS read-only will live on 'code', this will only be a sync from > the SVN to the CVS. See http://svn2cvs.tigris.org/ for details > > > As I tried to ask for in the past, would someone also illustrate > the importance of why _WE_ need to switch to SVN on a wiki page on > Bioperl so that when someone complains/asks about this in the > future the arguments are already laid out. I am basically fine > with it, but I don't honestly see a compelling reason beyond what > has been mentioned wrt better integration in IDEs. > http://bioperl.org/wiki/Why_SVN > > -jason > On Jun 27, 2007, at 9:46 PM, George Hartzell wrote: > >> Chris Fields writes: >>> [...] >>> Now how about a quick straw poll, what kind of access? svn+ssh is >>> already available, but some (Aaron among them) have indicated they >>> would like https as well (not sure how involved it would be to >>> set up). >> >> What we do here, in large part, depends on what our host machine >> makes >> available to us. >> >> Is there an apache instance that we can use? Maybe a separate one? >> >> May someone among us configure it, or do we need to ask for help? >> (in >> other words, does anyone have sudo?) >> >> Is there some reason to not include http: (using Digest >> authentication >> so that passwords aren't passed in the clear?)? Maybe even go so far >> as to ask why bother with https:, it's not like we need to transfer >> any data encrypted.... >> >> g. 
>> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > Jason Stajich > jason at bioperl.org > http://jason.open-bio.org/ > > From cjfields at uiuc.edu Thu Jun 28 00:18:03 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 23:18:03 -0500 Subject: [Bioperl-l] Splits again In-Reply-To: <4682E824.1050507@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> Message-ID: On Jun 27, 2007, at 5:43 PM, Sendu Bala wrote: > Chris Fields wrote: > ... >> If a fix needed to be made in one set, make the fix, test against >> bioperl 'base' as a whole, and release when possible. No need to >> wait for a full-fledged 1.5.3 release. > > What advantage is there of these defined splits instead of > individual modules? As I see it you lose some of the potential > benefits of breaking Bioperl up completely, whilst also suffering > the maintenance problems I outlined in my objection to Steve's post. > > Being able to work on all Bioperl from a single cvs (ne svn) check > out/ archive, whilst distributing it as individual modules on CPAN > seems like the best of both worlds to me. What am I missing? Okay, forewarned, but here's my long-winded reasoning. The short and sweet version: I (very) respectfully don't agree with you, at least re: the idea we should commit all modules to CPAN independently. It doesn't make any sense to me, but maybe you can elaborate more? Maybe I'm misinterpreting what you mean? Also, I agree with Steve C. that core is anything but a representation of a 'core' set of modules, and some sections could (should?) be split off into discrete, cohesive units. 
We may be alone in that camp, though it doesn't seem so (it's popped up more than a few times, in one form or another). If you want an in-depth explanation for both opinions, read on (below my sig), or feel free to bypass it. I'll understand. Finally, all of this should wait until later. Much later, like after a decent release, after svn, etc kind of 'later'. I think we can agree on that. . . . . . Still here? Okay... each issue (skip as needed): Individual CPAN modules: CPAN is not our personal versioning system; it may be if a distribution consists of only a few modules, but not when it's one of the largest distros present. If someone wants to update an individual bioperl module for a quick bug fix they are more than welcome to download it via cvs, svn, or even using a web browser, and replace the one they have. In most cases, it works w/o problems. With Module::Build you have even made it easier if a full installation is necessary. I'm trying to reason how one could break up the individual SeqIO/ SearchIO/otherIO modules into single module distributions. They are intrinsically tied together (SeqIO::genbank won't work w/o SeqIO, which relies on the various interfaces, RootIO, and on down). How would tests be run off CPAN when the modules are distributed independently? Would they also be individually distributed? What would you use to tie all the individual modules together? How would you explain to the CPAN maintainers that you want to split bioperl into 990 individual modules, all updated independently, but intend on bundling them afterwards anyway? I'm failing to see the advantages to this approach, but if you can find an example where this was done successfully on CPAN or elsewhere maybe I could see what you mean. Splitting up core: As I see it, here are the advantages of a defined split as Steve and I see it (off the top of my head). Some of this probably reiterates my previous points, as well as Steve's, so apologies in advance. 
- A lean, mean, focused set of bioperl base modules (core) w/o or with very few external deps, minimal installation issues, etc. The very basic stuff to get up and running. - BioPerl bundled modules (Nathan's 'cliques') with defined, focused functionality, code, and tests, which add a bit more 'sugar' to the base functionality of the core. If you only care about parsing BLAST reports, get SearchIO, which requires core and optionally other modules (XML::SAX). If you want additional DB functionality apart from the very basic ones in core, install DB (with it's additional requirements, including core, DBI, and so on). Same with Graphics, Tools, Tree/Phylo, etc. We just need to define and limit the number of splits. - Easier to add additional bundled modules. For instance, I could focus all of my RNA work into a discrete set of modules (say, bioperl- rna) which I maintain, I ensure works with the latest core code, I ensure also plays well with the other children =) , and I distribute via CPAN. Same with EUtilities, which could go into a separated DB- related set or stay in core. - If we want a full-fledged 'install everything', the CPAN Bundle system is available. I think it's easier to use a Bundle for 4-5, even 10 groups of modules as opposed to over 900. - A Bundle or a build file where discrete distributions are listed (Bio::SearchIO, etc) wouldn't need to be updated every time a new module is added to a distribution. I suppose this could be automated, but why have the additional headache? - A chance to cut out some cruft. We all know that particular areas need work or a complete overhaul (Restriction, Structure, maybe a few others). Smaller, concentrated sets of modules I believe would be easier to maintain, and those that don't get use will eventually fall out of favor and may be lost or replaced from the more maintained group of modules. Survival of the fittest. - We already have had practice; bioperl-db, bioperl-run, bioperl- network, and others. 
Those that have been routinely maintained and enjoy wide use (db, run, network) have survived; others not so much (corba-related stuff, microarray, ext, etc., though the code is still available if someone else wants to take it up and revive it!). Disadvantages of a defined split: - The initial headache of identifying which groups go where, coordinating with those who rely on bioperl (GMOD, etc) on how this will be set up, so on... - Separate groups of modules require testing together to ensure functionality is consistent and maintained (something I think you pointed out previously). - I think an increased possibility of branching is possible. - Extra headaches for devs, who have to keep track of the various critical distributions and make sure they work well together. - Maybe others, but it's getting late here. Add more as needed; I'm sure there are a number more. chris From cjfields at uiuc.edu Thu Jun 28 01:17:01 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 28 Jun 2007 00:17:01 -0500 Subject: [Bioperl-l] Splits again In-Reply-To: <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org> Message-ID: <671B8432-28DA-47DA-9E0C-66AF0E3D5973@uiuc.edu> D'oh! Just when I wanted to go to bed. It's not fair, you're in California... On Jun 27, 2007, at 10:51 PM, Jason Stajich wrote: > Hey guys - I'm wading in a bit late as I haven't had time to keep up > with whole discussion. > > So you are suggesting 800+ individual CPAN modules? I don't think > that is a good idea. Why would you split up Bio::Seq::RichSeq and > Bio::Seq into two separate packages for example? 
I think if you > really want to move away from the monolithic install it has to be > more logical by function - but I am not that optimistic that this is > going to actually be easier for people. Maybe I'm misunderstanding. Okay, so maybe it wasn't just me. > What are the arguments for separating things -- to make it so people > aren't scared by the number of modules so they'll code? It seems > like some people just want it to be installed and run scripts - does > having them install dozens of modules work. Do we need to consider > people how much this would suck if someone can't use CPAN or > Module::Builder to automate dependancy tracking installation? How > does it work when modules are deprecated? What I envision for core is maybe not just one distribution, but a cluster of distributions: base - Bio::Seq; Bio::SeqIO; Bio::AlignIO, some Bio::DB, associated modules. Bare bones, with as few dependencies as possible. aux - Any Bio::SeqIO, Bio::AlignIO, Bio::DB etc. that requires additional modules. search - Bio::Search and SearchIO tools - Bio::Tools, Bio::Restriction, maybe DB modules, GFF-related stuff? graphics - Bio::Graphics. Maybe GMOD-related stuff here? The last four would list bioperl-core as a dependency themselves along with any other modules necessary. We could also have the core Build.PL ask the user if they want to install the other non-base distros, and maybe include bioperl-db, bioperl-network, and bioperl- run in the loop if requested. All would be installed as a bundle similar to Bundle::BioPerl, but have regular CPAN point releases (1.x.x) independently from one another i.e. for bug fixes, with a yearly/biyearly timed full release (1.x) of the whole shebang. Any point release for any 'core' distribution would have to be tested against the others prior to release. 
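[Editorial sketch] The clustered layout described above (base, aux, search, tools, graphics, each non-base distribution listing core as a dependency) could look roughly like the following Build.PL for one split-off distribution. This is only an illustration under stated assumptions: the distribution name "bioperl-search", the version numbers, and the use of Bio::Root::Version as the module that stands for "core" are all hypothetical and were not proposed in the thread. Only the Module::Build calls themselves are standard.

```perl
# Build.PL for a hypothetical split-off "bioperl-search" distribution.
# Names and version numbers are placeholders, not an agreed layout.
use strict;
use warnings;
use Module::Build;

my $build = Module::Build->new(
    dist_name     => 'bioperl-search',   # hypothetical distribution name
    module_name   => 'Bio::SearchIO',
    dist_version  => '1.005003',         # hypothetical point release
    dist_abstract => 'Bio::Search and Bio::SearchIO split out of core',
    license       => 'perl',

    # Hard dependency on the base distribution. Bio::Root::Version is
    # assumed here to be the module whose version represents "core".
    requires => {
        'Bio::Root::Version' => '1.005002',   # hypothetical minimum
    },

    # Optional extras, e.g. for the XML-based report parsers.
    recommends => {
        'XML::SAX' => 0,
    },
);

# Writes the ./Build script used for "./Build test" and "./Build install".
$build->create_build_script;
```

With something along these lines, a point release of the search distribution would only need its own dist_version bumped, while the minimum core version in requires records which base release it was tested against.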
This is basically following Steve's train of thought, though more elaborated: http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15315/ focus=15315 > I'm not sure I have made up my mind on what I'd like to see, but at > some point I think we need to get a clearer idea of what audience we > are trying to serve best. If want it to be easy to install maybe we > should invest time into making OSX double-click installers, RPMs, and > the Windows stuff easily installable. If we want to serve the > developers who aren't using SVN so we want to push out releases of > modules ASAP? I just am not clear on the motivation for some of the > proposed changes. I think regular CPAN releases with updated PPMs hosted via portal work fine for the most part, but it would be nice to host RPMs. Others (Allen Day, for instance) have donated time to generate RPMs but they seem to lag behind a bit more. The original idea for svn arose from an unrelated thread with Mark Johnson discussing something (Glimmer maybe?) and took off from there. I was actually pretty surprised it took on a life of it's own. As for the motivation to switch, I haven't specifically used it myself, but the large number of responses seem to indicate others have and seem happy with it. Rutger Vos had also indicated he would move Bio::Phylo over to the repo if we used svn. We def. should address the issues you bring up (why _WE_ need svn) more succinctly but that shouldn't be an issue. > Also - the main point I wanted to make - Can I suggest we spend a > little time discussing what it will take to get a stable release for > the current code as it stands (bioperl-live and bioperl-run)? It > seems like we really need to do this first so that we have a stable > release that can be followed by CVS -> SVN migration, then consider > major changes to the repository structure and release packaging, and > potential deprecation and incorporation of other modules. Agreed. We prob. 
need to schedule a good couple of days (or so) to squash bugs. > I assume there is no chance that we'd have a 1.6 candidate by BOSC > next month? Um, not likely as nothing has been addressed Feature/Annotation-wise (overloads are still there, methods have not been deprecated, etc). There was an underlying assumption these would have an effect on GMOD- related stuff (I remember reading a post from Scott Cain in the mail archive mentioning something along these lines after the 1.5 release hubbub). Maybe a quick 1.5.3 for BOSC, with a 1.6 for fall? > Will it be productive to schedule a fair amount of time at BOSC > discussing how to partition out the packages into separate sub- > packages after we've done a successful release rather than trying to > change things right now? I realize not everyone will be there but > maybe it will be easier to interact on this then. How many are going to be there? I can't go this year except on my own dime (which I don't have many of, student loans and all, sorry), though I'll likely be in a new lab by spring which is likely more amenable to funding. If there is a hackathon in the late fall (post- sept) I'll make it a point to go regardless. > I think it will also be time to talk with Lincoln/Scott about how > Gbrowse is structured and if that is working for them. There is too > much code in different places that I think we need to figure out how > to structure it properly so those packages can be released. It would > probably mean moving Bio::Graphics, Bio::DB::GFF and > Bio::DB::SeqFeature and gff tools for Gbrowse into separate packages > so they could be released more regularly on par with Gbrowse > schedules. Also I think someone needs to figure out Bio::Tools::GFF > vs Bio::FeatureIO -- what do we want to do? I don't think we really > fully support GFF3 that well -- the X2GFF scripts probably need some > more good testing (where X is BLAST,FASTA,Sim4, GenBank, EMBL, > etc... ) and or migration to the proper GFF writing. 
> > > -jason Will Lincoln or Scott be at BOSC? chris From dmessina at wustl.edu Thu Jun 28 01:21:58 2007 From: dmessina at wustl.edu (David Messina) Date: Thu, 28 Jun 2007 00:21:58 -0500 Subject: [Bioperl-l] finding statistics on AA In-Reply-To: <4681F4B4.8010609@pacific.net.sg> References: <4681F4B4.8010609@pacific.net.sg> Message-ID: Hi Melvin, I don't think BioPerl has any information content-related code. I'm not terribly familiar with it myself, but the usual recommendation is to look at the EMBOSS package: http://en.wikipedia.org/wiki/EMBOSS Dave From bix at sendu.me.uk Thu Jun 28 02:38:48 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 07:38:48 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org> Message-ID: <46835778.5070901@sendu.me.uk> Jason Stajich wrote: > So you are suggesting 800+ individual CPAN modules? > I don't think that is a good idea. Why would you split up > Bio::Seq::RichSeq and Bio::Seq into two separate packages for > example? I think if you really want to move away from the monolithic > install it has to be more logical by function - but I am not that > optimistic that this is going to actually be easier for people. > Maybe I'm misunderstanding. > > What are the arguments for separating things -- to make it so people > aren't scared by the number of modules so they'll code? It seems > like some people just want it to be installed and run scripts - does > having them install dozens of modules work.
Do we need to consider > people how much this would suck if someone can't use CPAN or > Module::Builder to automate dependancy tracking installation? How > does it work when modules are deprecated? See my upcoming reply to Chris. Briefly, if the only change is to the dist action of Build.PL, we can make a single archive of all modules available to non-CPAN users, and individual modules available to CPAN users. No problems. > Also - the main point I wanted to make - Can I suggest we spend a > little time discussing what it will take to get a stable release for > the current code as it stands (bioperl-live and bioperl-run)? It > seems like we really need to do this first so that we have a stable > release that can be followed by CVS -> SVN migration, then consider > major changes to the repository structure and release packaging, and > potential deprecation and incorporation of other modules. I'd recommend that a 'stable' release shouldn't happen until we resolve all the missing tests and bugzilla bugs (because I think the opportunity should be taken to have it stable both in terms of interface /and/ bugs). Which is a lot of work. > I assume there is no chance that we'd have a 1.6 candidate by BOSC > next month? None. From bix at sendu.me.uk Thu Jun 28 03:25:03 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 08:25:03 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> Message-ID: <4683624F.6020402@sendu.me.uk> Chris Fields wrote: > On Jun 27, 2007, at 5:43 PM, Sendu Bala wrote: >> What advantage is there of these defined splits instead of >> individual modules? 
As I see it you lose some of the potential >> benefits of breaking Bioperl up completely, whilst also suffering >> the maintenance problems I outlined in my objection to Steve's post. >> >> Being able to work on all Bioperl from a single cvs (ne svn) check >> out/ archive, whilst distributing it as individual modules on CPAN >> seems like the best of both worlds to me. What am I missing? > > Okay, forewarned, but here's my long-winded reasoning. The short and > sweet version: I (very) respectfully don't agree with you, at least > re: the idea we should commit all modules to CPAN independently. It > doesn't make any sense to me, but maybe you can elaborate more? > Maybe I'm misinterpreting what you mean? The short and sweet version: my proposal has all the benefits of yours, but none of the disadvantages. What's not to like? > Finally, all of this should wait until later. Much later, like after > a decent release, after svn, etc kind of 'later'. I think we can > agree on that. Hmm, not really. If it can be implemented by a change in just Build.PL and ModuleBuildBioperl, its really independent of everything else. That's the beauty of it: the only thing that changes is how things are uploaded to and downloaded from CPAN. The only person that normally deals with that issue is the pumpkin for a release, and he only cares about it at release time. In fact, if we're going to do it at all it makes sense to try it out on a minor release like 1.5.3. We've already got experience of doing it split-style from 1.5.2. (And let me tell you: splits at the code-base level suck.) > Individual CPAN modules: > > CPAN is not our personal versioning system; it may be if a > distribution consists of only a few modules, but not when it's one of > the largest distros present. If someone wants to update an > individual bioperl module for a quick bug fix they are more than > welcome to download it via cvs, svn, or even using a web browser, and > replace the one they have. 
And where is the harm in letting them do it via CPAN as well? In fact, there are significant benefits: > I'm trying to reason how one could break up the individual SeqIO/ > SearchIO/otherIO modules into single module distributions. They are > intrinsically tied together (SeqIO::genbank won't work w/o SeqIO, > which relies on the various interfaces, RootIO, and on down). How > would tests be run off CPAN when the modules are distributed > independently? Bio::SeqIO::genbank would have a dependency on the latest version of Bio::SeqIO (etc.), and Bio::SeqIO would have its own dependencies. So when a user wants to get the latest version of Bio::SeqIO::genbank, they no longer have to worry about what other modules in its dependency hierarchy they should also install. Instead they just request Bio::SeqIO::genbank which itself ensures you have the latest version of all its dependencies before installing itself and running its tests. When a dev makes a major bugfix to Bio::SeqIO::genbank that all genbank users should have, he could just call './Build dist Bio::SeqIO::genbank' which would generate a new package for Bio::SeqIO::genbank suitable for uploading to CPAN. No more long release cycles and having to constantly tell people to 'use CVS' to get working Bioperl code. > Would they also be individually distributed? What > would you use to tie all the individual modules together? How would > you explain to the CPAN maintainers that you want to split bioperl > into 990 individual modules, all updated independently, but intend on > bundling them afterwards anyway? They would be tied together by a CPAN bundle. You don't have to 'explain' anything to the CPAN maintainers because you're not doing anything wrong. In fact, you're using it the way you're supposed to. > Splitting up core: > > As I see it, here are the advantages of a defined split as Steve and > I see it (off the top of my head). 
Some of this probably reiterates > my previous points, as well as Steve's, so apologies in advance. Below I answer with how it would be with my single-module approach compared to the defined splits. > - A lean, mean, focused set of bioperl base modules (core) w/o or > with very few external deps, minimal installation issues, etc. The > very basic stuff to get up and running. Even leaner, even more focused. > - BioPerl bundled modules (Nathan's 'cliques') with defined, focused > functionality, code, and tests, which add a bit more 'sugar' to the > base functionality of the core. If you only care about parsing BLAST > reports, get SearchIO, which requires core and optionally other > modules (XML::SAX). If you want additional DB functionality apart > from the very basic ones in core, install DB (with it's additional > requirements, including core, DBI, and so on). Same with Graphics, > Tools, Tree/Phylo, etc. We just need to define and limit the number > of splits. The same can be achieved with CPAN bundles for each kind of functional grouping you can think of. And since its just a single text file that defines such a grouping, its easy to change or add new ones as you feel like it, as opposed to the rather more permanent and substantial effort of creating one of your splits on the code-base level. Also, the world doesn't have to rely on /our/ ideas of what a useful functional split is. If someone just wants to parse Blast results, they can just use CPAN to install Bio::SearchIO::blast_pull instead of having to install all of SearchIO. > - Easier to add additional bundled modules. For instance, I could > focus all of my RNA work into a discrete set of modules (say, bioperl- > rna) which I maintain, I ensure works with the latest core code, I > ensure also plays well with the other children =) , and I distribute > via CPAN. Same with EUtilities, which could go into a separated DB- > related set or stay in core. And if you lose interest in them? 
They eventually die because they no longer have someone looking after them by default (the pumpkin and other devs). Alternatively you could just make a CPAN bundle. One text file! Easy! No duplication of modules in CPAN, no new hassle for you or the Bioperl 'core' pumpkin to ensure that the latest version of each work with each other and other splits. > - If we want a full-fledged 'install everything', the CPAN Bundle > system is available. I think it's easier to use a Bundle for 4-5, > even 10 groups of modules as opposed to over 900. No, it isn't any easier. Its /equally/ easy to install a bundle of 900 packages of 900 modules as it is to install 5 packages of 900 modules. When not installing absolutely everything, but perhaps 'most' things, there's the additional benefit that it would be easier to skip a particular Bio::module because you didn't want to install its external dependencies and weren't that interested in it anyway. > - A Bundle or a build file where discrete distributions are listed > (Bio::SearchIO, etc) wouldn't need to be updated every time a new > module is added to a distribution. I suppose this could be > automated, but why have the additional headache? Yes, it would be automated, and no, it wouldn't at all be any kind of additional headache. I'm proposing a fully-automated system that the pumpkin wouldn't even have to think about it. Much /less/ of a headache than dealing with splits. Orders of magnitude easier to deal with. > - A chance to cut out some cruft. We all know that particular areas > need work or a complete overhaul (Restriction, Structure, maybe a few > others). Smaller, concentrated sets of modules I believe would be > easier to maintain, and those that don't get use will eventually fall > out of favor and may be lost or replaced from the more maintained > group of modules. Survival of the fittest. And the smallest, most concentrated set of modules is the individual module. 
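[Editorial sketch] The "one text file" CPAN bundle mentioned above is an ordinary Perl module whose POD contains a CONTENTS section with one module per line, optionally followed by a version and a comment. A minimal sketch follows, assuming a hypothetical bundle name and hypothetical version numbers, and assuming that Bio::SearchIO::blast and Bio::SearchIO::blast_pull were individually installable from CPAN as the single-module proposal envisages. How strictly a given CPAN client treats the version column has not been verified here.

```perl
package Bundle::BioPerl::SearchBlast;   # hypothetical bundle name

use strict;
use warnings;

our $VERSION = '0.01';

1;

__END__

=head1 NAME

Bundle::BioPerl::SearchBlast - hypothetical bundle for parsing BLAST reports

=head1 SYNOPSIS

  perl -MCPAN -e 'install Bundle::BioPerl::SearchBlast'

=head1 CONTENTS

Bio::SearchIO::blast 1.005003 - hypothetical minimum version

Bio::SearchIO::blast_pull 1.005003 - hypothetical minimum version

XML::SAX - optional, only for the XML-based parsers

=cut
```

Changing a functional grouping then means editing the CONTENTS list and uploading a new bundle version; no module code moves in the repository.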
> - We already have had practice; bioperl-db, bioperl-run, > bioperl-network, and others. Those that have been routinely maintained and > enjoy wide use (db, run, network) have survived; others not so much > (corba-related stuff, microarray, ext, etc., though the code is still > available if someone else wants to take it up and revive it!). The reason some of these existing splits (microarray, ext) have fallen by the wayside? /Because/ they're splits. If they had been part of bioperl-live all along, they'd have been kept in a working, compatible state and would have been released along with everything else in 1.5.2. > Disadvantages of a defined split: > > - The initial headache of identifying which groups go where, > coordinating with those who rely on bioperl (GMOD, etc) on how this > will be set up, so on... No need to worry about this with individual modules. > - Separate groups of modules require testing together to ensure > functionality is consistent and maintained (something I think you > pointed out previously). No need to worry. > - I think an increased possibility of branching is possible. > > - Extra headaches for devs, who have to keep track of the various > critical distributions and make sure they work well together. No headaches. From charles-listes+bioperl at plessy.org Thu Jun 28 03:40:04 2007 From: charles-listes+bioperl at plessy.org (Charles Plessy) Date: Thu, 28 Jun 2007 16:40:04 +0900 Subject: [Bioperl-l] Is bioperl-live the nickname of the CVS head ? Message-ID: <20070628074004.GD6338@kunpuu.plessy.org> Dear developers, I am considering bringing bioperl 1.5.2 into Debian, and I am wondering if it would make sense to call it "bioperl-live" and distribute it in parallel with the stable 1.4.0 version, if bioperl-live means "the current developer version". If I am wrong, can somebody explain to me what bioperl-live exactly refers to?
Have a nice day, -- Charles Plessy Debian-med packaging team Wako, Saitama, Japan From n.haigh at sheffield.ac.uk Thu Jun 28 04:23:10 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Thu, 28 Jun 2007 09:23:10 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: <4683624F.6020402@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: <46836FEE.5030203@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Sendu Bala wrote: > Chris Fields wrote: >> On Jun 27, 2007, at 5:43 PM, Sendu Bala wrote: >>> What advantage is there of these defined splits instead of >>> individual modules? As I see it you lose some of the potential >>> benefits of breaking Bioperl up completely, whilst also suffering >>> the maintenance problems I outlined in my objection to Steve's post. >>> >>> Being able to work on all Bioperl from a single cvs (ne svn) check >>> out/ archive, whilst distributing it as individual modules on CPAN >>> seems like the best of both worlds to me. What am I missing? >> >> Okay, forewarned, but here's my long-winded reasoning. The short and >> sweet version: I (very) respectfully don't agree with you, at least >> re: the idea we should commit all modules to CPAN independently. It >> doesn't make any sense to me, but maybe you can elaborate more? >> Maybe I'm misinterpreting what you mean? > > The short and sweet version: my proposal has all the benefits of yours, > but none of the disadvantages. What's not to like? > > >> Finally, all of this should wait until later. Much later, like after >> a decent release, after svn, etc kind of 'later'. I think we can >> agree on that. > > Hmm, not really. 
If it can be implemented by a change in just Build.PL > and ModuleBuildBioperl, its really independent of everything else. > That's the beauty of it: the only thing that changes is how things are > uploaded to and downloaded from CPAN. The only person that normally > deals with that issue is the pumpkin for a release, and he only cares > about it at release time. > > In fact, if we're going to do it at all it makes sense to try it out on > a minor release like 1.5.3. We've already got experience of doing it > split-style from 1.5.2. (And let me tell you: splits at the code-base > level suck.) > > >> Individual CPAN modules: >> >> CPAN is not our personal versioning system; it may be if a >> distribution consists of only a few modules, but not when it's one of >> the largest distros present. If someone wants to update an >> individual bioperl module for a quick bug fix they are more than >> welcome to download it via cvs, svn, or even using a web browser, and >> replace the one they have. > > And where is the harm in letting them do it via CPAN as well? In fact, > there are significant benefits: > > >> I'm trying to reason how one could break up the individual SeqIO/ >> SearchIO/otherIO modules into single module distributions. They are >> intrinsically tied together (SeqIO::genbank won't work w/o SeqIO, >> which relies on the various interfaces, RootIO, and on down). How >> would tests be run off CPAN when the modules are distributed >> independently? > > Bio::SeqIO::genbank would have a dependency on the latest version of > Bio::SeqIO (etc.), and Bio::SeqIO would have its own dependencies. > > So when a user wants to get the latest version of Bio::SeqIO::genbank, > they no longer have to worry about what other modules in its dependency > hierarchy they should also install. > > Instead they just request Bio::SeqIO::genbank which itself ensures you > have the latest version of all its dependencies before installing itself > and running its tests. 
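[Editorial sketch] The behaviour described in the quoted text above (requesting Bio::SeqIO::genbank pulls in its whole dependency hierarchy first) amounts to a depth-first walk over per-module dependency lists. The following self-contained Perl sketch shows the idea; the dependency table is a simplified, hand-written assumption, not the real BioPerl dependency graph, and install_order is a hypothetical helper rather than anything in Build.PL or CPAN.pm.

```perl
use strict;
use warnings;

# Simplified, hypothetical dependency table. In the proposal, this
# information would come from each module's own build metadata.
my %requires = (
    'Bio::SeqIO::genbank' => ['Bio::SeqIO'],
    'Bio::SeqIO::fasta'   => ['Bio::SeqIO'],
    'Bio::SeqIO'          => ['Bio::Root::IO'],
    'Bio::Root::IO'       => ['Bio::Root::Root'],
    'Bio::Root::Root'     => [],
);

# Return the modules needed for the requested targets, dependencies
# first, with every module listed exactly once.
sub install_order {
    my ($deps, @targets) = @_;
    my (%seen, @order);
    my $visit;
    $visit = sub {
        my ($module) = @_;
        return if $seen{$module}++;              # already scheduled
        $visit->($_) for @{ $deps->{$module} || [] };
        push @order, $module;                    # after its dependencies
    };
    $visit->($_) for @targets;
    return @order;
}

print "$_\n"
    for install_order(\%requires, 'Bio::SeqIO::genbank', 'Bio::SeqIO::fasta');
```

Because shared dependencies such as Bio::SeqIO are scheduled only once, asking for both the genbank and fasta parsers does not install (or test) the common modules twice.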
This was my thinking when I first brought this up at the beginning/splitting of this thread. Thinking of modules as the constituent parts of a larger package should make it far easier for people to define dependencies, and users would only need to install those parts they require. As Sendu points out, if the user wants to convert seqs from genbank to fasta they could simply install Bio::SeqIO::genbank and Bio::SeqIO::fasta and they would get all the other modules that are the dependencies of Bio::SeqIO::genbank and Bio::SeqIO::fasta. > > When a dev makes a major bugfix to Bio::SeqIO::genbank that all genbank > users should have, he could just call './Build dist Bio::SeqIO::genbank' > which would generate a new package for Bio::SeqIO::genbank suitable for > uploading to CPAN. No more long release cycles and having to constantly > tell people to 'use CVS' to get working Bioperl code. However, how would the test suite work out with this? For example, when someone installs Bio::SeqIO::genbank they want the tests associated with Bio::SeqIO::genbank to be run. Would there be tests that would be run redundantly if, for example, someone installed Bio::SeqIO::genbank and Bio::SeqIO::fasta? > > >> Would they also be individually distributed? What would you use to >> tie all the individual modules together? How would you explain to >> the CPAN maintainers that you want to split bioperl into 990 >> individual modules, all updated independently, but intend on bundling >> them afterwards anyway? > > They would be tied together by a CPAN bundle. You don't have to > 'explain' anything to the CPAN maintainers because you're not doing > anything wrong. In fact, you're using it the way you're supposed to. Yep. Real modules are released as modules, each with their own set of dependencies. Then use CPAN bundles the way they were supposed to be used: distributing a set of CPAN modules that make a coherent set of functionality.
You "could" also bundle in other authors modules e.g. Bio::ASN1::EntrezGene? > > >> Splitting up core: >> >> As I see it, here are the advantages of a defined split as Steve and >> I see it (off the top of my head). Some of this probably reiterates >> my previous points, as well as Steve's, so apologies in advance. > > Below I answer with how it would be with my single-module approach > compared to the defined splits. > > >> - A lean, mean, focused set of bioperl base modules (core) w/o or >> with very few external deps, minimal installation issues, etc. The >> very basic stuff to get up and running. > > Even leaner, even more focused. > > >> - BioPerl bundled modules (Nathan's 'cliques') with defined, focused >> functionality, code, and tests, which add a bit more 'sugar' to the >> base functionality of the core. If you only care about parsing BLAST >> reports, get SearchIO, which requires core and optionally other >> modules (XML::SAX). If you want additional DB functionality apart >> from the very basic ones in core, install DB (with it's additional >> requirements, including core, DBI, and so on). Same with Graphics, >> Tools, Tree/Phylo, etc. We just need to define and limit the number >> of splits. > > The same can be achieved with CPAN bundles for each kind of functional > grouping you can think of. And since its just a single text file that > defines such a grouping, its easy to change or add new ones as you feel > like it, as opposed to the rather more permanent and substantial effort > of creating one of your splits on the code-base level. > > Also, the world doesn't have to rely on /our/ ideas of what a useful > functional split is. If someone just wants to parse Blast results, they > can just use CPAN to install Bio::SearchIO::blast_pull instead of having > to install all of SearchIO. > > >> - Easier to add additional bundled modules. 
For instance, I could >> focus all of my RNA work into a discrete set of modules (say, >> bioperl-rna) which I maintain, I ensure works with the latest core code, I >> ensure also plays well with the other children =) , and I distribute >> via CPAN. Same with EUtilities, which could go into a separated >> DB-related set or stay in core. > > And if you lose interest in them? They eventually die because they no > longer have someone looking after them by default (the pumpkin and other > devs). Alternatively you could just make a CPAN bundle. One text file! > Easy! No duplication of modules in CPAN, no new hassle for you or the > Bioperl 'core' pumpkin to ensure that the latest version of each work > with each other and other splits. Hmm, how would module versions be handled? Wouldn't this approach require each module to have its own independent version number, which could then be used for building the dependencies? Each new release of that module would only bump that module's version number. Bundles can specify the minimum version of a module to be installed, such that bug fixes to individual modules can be released to CPAN and would automatically get picked up when installing bundles etc. I'm not quite sure how the current stable/dev releases would work. I assume bug fixes would have to be made on a branch (e.g. branch 1.6) and released to CPAN from there. Then when the next stable release is made, all module versions would be bumped and released to CPAN, along with any modifications to the content of the bundle. Is it possible to have stable and developer release bundles that specify the minimum stable and developer module versions respectively? > > >> - If we want a full-fledged 'install everything', the CPAN Bundle >> system is available. I think it's easier to use a Bundle for 4-5, >> even 10 groups of modules as opposed to over 900. > > No, it isn't any easier.
Its /equally/ easy to install a bundle of 900 > packages of 900 modules as it is to install 5 packages of 900 modules. > > When not installing absolutely everything, but perhaps 'most' things, > there's the additional benefit that it would be easier to skip a > particular Bio::module because you didn't want to install its external > dependencies and weren't that interested in it anyway. > > >> - A Bundle or a build file where discrete distributions are listed >> (Bio::SearchIO, etc) wouldn't need to be updated every time a new >> module is added to a distribution. I suppose this could be >> automated, but why have the additional headache? > > Yes, it would be automated, and no, it wouldn't at all be any kind of > additional headache. I'm proposing a fully-automated system that the > pumpkin wouldn't even have to think about it. Much /less/ of a headache > than dealing with splits. Orders of magnitude easier to deal with. > > >> - A chance to cut out some cruft. We all know that particular areas >> need work or a complete overhaul (Restriction, Structure, maybe a few >> others). Smaller, concentrated sets of modules I believe would be >> easier to maintain, and those that don't get use will eventually fall >> out of favor and may be lost or replaced from the more maintained >> group of modules. Survival of the fittest. > > And the smallest, most concentrated set of modules is the individual > module. > > >> - We already have had practice; bioperl-db, bioperl-run, bioperl- >> network, and others. Those that have been routinely maintained and >> enjoy wide use (db, run, network) have survived; others not so much >> (corba-related stuff, microarray, ext, etc., though the code is still >> available if someone else wants to take it up and revive it!). > > The reason some of these existing splits (micoarray, ext) have fallen by > the way-side? /Because/ they're splits. 
If they had been part of > bioperl-live all along, they'd have been kept in a working, compatible > state and would have been released along with everything else in 1.5.2 > > >> Disadvantages of a defined split: >> >> - The initial headache of identifying which groups go where, >> coordinating with those who rely on bioperl (GMOD, etc) on how this >> will be set up, so on... > > No need to worry about this with individual modules. > > >> - Separate groups of modules require testing together to ensure >> functionality is consistent and maintained (something I think you >> pointed out previously). > > No need to worry. Maybe we need to worry about how the tests are run when installing individual modules etc? > > >> - I think an increased possibility of branching is possible. >> >> - Extra headaches for devs, who have to keep track of the various >> critical distributions and make sure they work well together. > > No headaches. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGg2/uczuW2jkwy2gRAlR4AJ44kHIXWWapNVGOIrkFBJdP9rn3vwCdErhT VkymyXNshguE44/RilEXWDA= =O5ex -----END PGP SIGNATURE----- From n.haigh at sheffield.ac.uk Thu Jun 28 04:27:54 2007 From: n.haigh at sheffield.ac.uk (Nathan S.
Haigh) Date: Thu, 28 Jun 2007 09:27:54 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: <4683624F.6020402@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: <4683710A.9010808@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Sendu Bala wrote: > Chris Fields wrote: >> On Jun 27, 2007, at 5:43 PM, Sendu Bala wrote: >>> What advantage is there of these defined splits instead of >>> individual modules? As I see it you lose some of the potential >>> benefits of breaking Bioperl up completely, whilst also suffering >>> the maintenance problems I outlined in my objection to Steve's post. >>> >>> Being able to work on all Bioperl from a single cvs (ne svn) check >>> out/ archive, whilst distributing it as individual modules on CPAN >>> seems like the best of both worlds to me. What am I missing? >> >> Okay, forewarned, but here's my long-winded reasoning. The short and >> sweet version: I (very) respectfully don't agree with you, at least >> re: the idea we should commit all modules to CPAN independently. It >> doesn't make any sense to me, but maybe you can elaborate more? >> Maybe I'm misinterpreting what you mean? > > The short and sweet version: my proposal has all the benefits of yours, > but none of the disadvantages. What's not to like? > > >> Finally, all of this should wait until later. Much later, like after >> a decent release, after svn, etc kind of 'later'. I think we can >> agree on that. > > Hmm, not really. If it can be implemented by a change in just Build.PL > and ModuleBuildBioperl, its really independent of everything else. 
> That's the beauty of it: the only thing that changes is how things are > uploaded to and downloaded from CPAN. The only person that normally > deals with that issue is the pumpkin for a release, and he only cares > about it at release time. > > In fact, if we're going to do it at all it makes sense to try it out on > a minor release like 1.5.3. We've already got experience of doing it > split-style from 1.5.2. (And let me tell you: splits at the code-base > level suck.) > > >> Individual CPAN modules: >> >> CPAN is not our personal versioning system; it may be if a >> distribution consists of only a few modules, but not when it's one of >> the largest distros present. If someone wants to update an >> individual bioperl module for a quick bug fix they are more than >> welcome to download it via cvs, svn, or even using a web browser, and >> replace the one they have. > > And where is the harm in letting them do it via CPAN as well? In fact, > there are significant benefits: > > >> I'm trying to reason how one could break up the individual SeqIO/ >> SearchIO/otherIO modules into single module distributions. They are >> intrinsically tied together (SeqIO::genbank won't work w/o SeqIO, >> which relies on the various interfaces, RootIO, and on down). How >> would tests be run off CPAN when the modules are distributed >> independently? > > Bio::SeqIO::genbank would have a dependency on the latest version of > Bio::SeqIO (etc.), and Bio::SeqIO would have its own dependencies. > > So when a user wants to get the latest version of Bio::SeqIO::genbank, > they no longer have to worry about what other modules in its dependency > hierarchy they should also install. > > Instead they just request Bio::SeqIO::genbank which itself ensures you > have the latest version of all its dependencies before installing itself > and running its tests. 
> > When a dev makes a major bugfix to Bio::SeqIO::genbank that all genbank > users should have, he could just call './Build dist Bio::SeqIO::genbank' > which would generate a new package for Bio::SeqIO::genbank suitable for > uploading to CPAN. No more long release cycles and having to constantly > tell people to 'use CVS' to get working Bioperl code. > > >> Would they also be individually distributed? What would you use to >> tie all the individual modules together? How would you explain to >> the CPAN maintainers that you want to split bioperl into 990 >> individual modules, all updated independently, but intend on bundling >> them afterwards anyway? > > They would be tied together by a CPAN bundle. You don't have to > 'explain' anything to the CPAN maintainers because you're not doing > anything wrong. In fact, you're using it the way you're supposed to. > The successor to Bundles - may prove interesting: http://search.cpan.org/~adamk/Task-1.01/lib/Task.pm > >> Splitting up core: >> >> As I see it, here are the advantages of a defined split as Steve and >> I see it (off the top of my head). Some of this probably reiterates >> my previous points, as well as Steve's, so apologies in advance. > > Below I answer with how it would be with my single-module approach > compared to the defined splits. > > >> - A lean, mean, focused set of bioperl base modules (core) w/o or >> with very few external deps, minimal installation issues, etc. The >> very basic stuff to get up and running. > > Even leaner, even more focused. > > >> - BioPerl bundled modules (Nathan's 'cliques') with defined, focused >> functionality, code, and tests, which add a bit more 'sugar' to the >> base functionality of the core. If you only care about parsing BLAST >> reports, get SearchIO, which requires core and optionally other >> modules (XML::SAX). 
If you want additional DB functionality apart >> from the very basic ones in core, install DB (with it's additional >> requirements, including core, DBI, and so on). Same with Graphics, >> Tools, Tree/Phylo, etc. We just need to define and limit the number >> of splits. > > The same can be achieved with CPAN bundles for each kind of functional > grouping you can think of. And since its just a single text file that > defines such a grouping, its easy to change or add new ones as you feel > like it, as opposed to the rather more permanent and substantial effort > of creating one of your splits on the code-base level. > > Also, the world doesn't have to rely on /our/ ideas of what a useful > functional split is. If someone just wants to parse Blast results, they > can just use CPAN to install Bio::SearchIO::blast_pull instead of having > to install all of SearchIO. > > >> - Easier to add additional bundled modules. For instance, I could >> focus all of my RNA work into a discrete set of modules (say, bioperl- >> rna) which I maintain, I ensure works with the latest core code, I >> ensure also plays well with the other children =) , and I distribute >> via CPAN. Same with EUtilities, which could go into a separated DB- >> related set or stay in core. > > And if you lose interest in them? They eventually die because they no > longer have someone looking after them by default (the pumpkin and other > devs). Alternatively you could just make a CPAN bundle. One text file! > Easy! No duplication of modules in CPAN, no new hassle for you or the > Bioperl 'core' pumpkin to ensure that the latest version of each work > with each other and other splits. > > >> - If we want a full-fledged 'install everything', the CPAN Bundle >> system is available. I think it's easier to use a Bundle for 4-5, >> even 10 groups of modules as opposed to over 900. > > No, it isn't any easier. 
Its /equally/ easy to install a bundle of 900 > packages of 900 modules as it is to install 5 packages of 900 modules. > > When not installing absolutely everything, but perhaps 'most' things, > there's the additional benefit that it would be easier to skip a > particular Bio::module because you didn't want to install its external > dependencies and weren't that interested in it anyway. > > >> - A Bundle or a build file where discrete distributions are listed >> (Bio::SearchIO, etc) wouldn't need to be updated every time a new >> module is added to a distribution. I suppose this could be >> automated, but why have the additional headache? > > Yes, it would be automated, and no, it wouldn't at all be any kind of > additional headache. I'm proposing a fully-automated system that the > pumpkin wouldn't even have to think about it. Much /less/ of a headache > than dealing with splits. Orders of magnitude easier to deal with. > > >> - A chance to cut out some cruft. We all know that particular areas >> need work or a complete overhaul (Restriction, Structure, maybe a few >> others). Smaller, concentrated sets of modules I believe would be >> easier to maintain, and those that don't get use will eventually fall >> out of favor and may be lost or replaced from the more maintained >> group of modules. Survival of the fittest. > > And the smallest, most concentrated set of modules is the individual > module. > > >> - We already have had practice; bioperl-db, bioperl-run, bioperl- >> network, and others. Those that have been routinely maintained and >> enjoy wide use (db, run, network) have survived; others not so much >> (corba-related stuff, microarray, ext, etc., though the code is still >> available if someone else wants to take it up and revive it!). > > The reason some of these existing splits (micoarray, ext) have fallen by > the way-side? /Because/ they're splits. 
If they had been part of > bioperl-live all along, they'd have been kept in a working, compatible > state and would have been released along with everything else in 1.5.2 > > >> Disadvantages of a defined split: >> >> - The initial headache of identifying which groups go where, >> coordinating with those who rely on bioperl (GMOD, etc) on how this >> will be set up, so on... > > No need to worry about this with individual modules. > > >> - Separate groups of modules require testing together to ensure >> functionality is consistent and maintained (something I think you >> pointed out previously). > > No need to worry. > > >> - I think an increased possibility of branching is possible. >> >> - Extra headaches for devs, who have to keep track of the various >> critical distributions and make sure they work well together. > > No headaches. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGg3EKczuW2jkwy2gRAriiAJ47Qz9jTshEXuaG0XMYrUTI0hHqAwCeL45r r/BykCKbM9lqJM0khARuEms= =NB4B -----END PGP SIGNATURE----- From n.haigh at sheffield.ac.uk Thu Jun 28 04:51:19 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Thu, 28 Jun 2007 09:51:19 +0100 Subject: [Bioperl-l] Is bioperl-live the nickname of the CVS head ? In-Reply-To: <20070628074004.GD6338@kunpuu.plessy.org> References: <20070628074004.GD6338@kunpuu.plessy.org> Message-ID: <46837687.7010101@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Charles Plessy wrote: > Dear developpers, > > I am considering to bring bioperl 1.5.2 in Debian, and I am wondering if > it would make sense to call it "bioperl-live" and distribute it in > parallel with the stable 1.4.0 version, if bioperl-live means "the > current developepr version". > > If I am wrong, can somebody explain me what bioperl-live exactly refers > to ? 
> > Have a nice day, > bioperl-live really means the HEAD of the cvs repository so is the most bleeding-edge code available. Version 1.5.* is the developer release, while the 1.4.* is the stable release. However, there have been few updates to the 1.4.* release which means that it is more unstable than the 1.5.* dev release. I think the consensus was to have more rapid release cycles of the stable branch in future in order to avoid this. I'm sure there are others more qualified to expand/correct me on this if needs be. Nath -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGg3aHczuW2jkwy2gRAo5pAJ95BGqrA5bLwRKNfUQi/HfBnkUJjwCg0mYB /fHFyYkqAvcmOSxu4djPll0= =KwVH -----END PGP SIGNATURE----- From bix at sendu.me.uk Thu Jun 28 05:11:39 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 10:11:39 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: <46836FEE.5030203@sheffield.ac.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> <46836FEE.5030203@sheffield.ac.uk> Message-ID: <46837B4B.7060705@sendu.me.uk> Nathan S. Haigh wrote: (Please try and snip more: don't quote whole posts just to reply to certain paragraphs) > Sendu Bala wrote: >> Chris Fields wrote: >> When a dev makes a major bugfix to Bio::SeqIO::genbank that all genbank >> users should have, he could just call './Build dist Bio::SeqIO::genbank' >> which would generate a new package for Bio::SeqIO::genbank suitable for >> uploading to CPAN. No more long release cycles and having to constantly >> tell people to 'use CVS' to get working Bioperl code. > > However, how would the test suite work out with this? e.g.
when someone > installs Bio::SeqIO::genbank they want to have the tests associated with > Bio::SeqIO::genbank to be run. Would there be tests that would be run > redundantly if for example someone installed Bio::SeqIO::genbank and > Bio::SeqIO::fasta? We would want to move to a strict test-script-per-module system. But that's desirable in any case, as it would greatly ease reaching our goal of complete test coverage, and subsequent maintenance of those tests. The genbank test would only run tests specific to genbank parsing, and likewise for fasta. They would both have a dependency on Bio::SeqIO, and if that was also recently updated, it would get installed prior to you installing genbank (and therefor run its own generic SeqIO tests), but wouldn't get installed again (wouldn't run its tests again) when you install fasta afterwards. On the subject of tests, I'm reminded of another benefit of the individual-module approach. Currently if a test fails during a CPAN install, nothing gets installed. Users do one of: # refuse to install at all (strict sys-admins) # cry and give up (newbies) # cry and seek help (newbies who really really need Bioperl) # force install, leaving them in some undefined state because they didn't understand the problems (most remaining users) # force install, happy that the problems are ok (some Bioperl devs) With a bundle of individual modules you would install virtually all Bioperl modules with no problems, and the problems with the remainder would be clear to everyone. No one would need to force install since the tests results would now be meaningful: the thing you're trying to install really isn't going to work if the tests are failing. If you really needed that particular Bioperl module you could then pay particular attention to why its failing (most likely some problem with an external dependency). >>> Would they also be individually distributed? What would you use to >>> tie all the individual modules together? 
>> >> They would be tied together by a CPAN bundle. You don't have to >> 'explain' anything to the CPAN maintainers because you're not doing >> anything wrong. In fact, you're using it the way you're supposed to. > > Yep. real modules are released as modules, each with their own set of > dependencies. The use CPAN bundles the way there were supposed to be for > - - distributing a set of CPAN modules that make a coherent set of > functionality. You "could" also bundle in other authors modules e.g. > Bio::ASN1::EntrezGene? Any bundle featuring Bio::SeqIO::entrezgene would necessarily include Bio::ASN1::EntrezGene in the bundle. > Hmm, how would module versions be handled? Wouldn't this approach > require each module to have it's own independent version number, which > could then be used for building the dependencies? Each new release of > that module would only bump that module's version number. Yes, that's how it would work. No more global version number. > Bundles can specify the minimum version of a module to be installed, > such that bug fixes to individual modules and be released into CPAN and > would automatically get picked up when installing bundles etc. Yes. > I'm not quite sure how the current stable/dev releases would work. I > assume bug fixes would have to be made on a branch e.g. branch 1.6 and > released to cpan from there. Then when the next stable release is made, > all module versions would be bumped and and released to CPAN. With any > modifications to the content of the bundle to be made. Is it possible to > have a stable and developer release bundles that are able to specify the > minimum stable and developer modules versions respectively? No, the distinction becomes pretty meaningless. We could still do big major releases, but modules wouldn't be version-bumped. The big release would just be an update of the bundle that specifies the latest version of all Bioperl modules. 
Remember that bundles only specify the minimum version, not the required version: in this brave new world users would end up with the same versions of modules if they installed a 1.8 bundle compared to 1.7 bundle. The only way to get a true snapshot of 1.7 after it was released would be if we took snapshots and archived them, making them available from bioperl.org (or by checking out the 1.7 tag from cvs/svn). I don't see that as a significant problem. You lose the trivial benefit of being able to install old snapshots from CPAN. The people who have a great need to install old snapshots can find their way to bioperl.org no problem. From bix at sendu.me.uk Thu Jun 28 04:50:09 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 09:50:09 +0100 Subject: [Bioperl-l] Is bioperl-live the nickname of the CVS head ? In-Reply-To: <20070628074004.GD6338@kunpuu.plessy.org> References: <20070628074004.GD6338@kunpuu.plessy.org> Message-ID: <46837641.8050106@sendu.me.uk> Charles Plessy wrote: > I am considering to bring bioperl 1.5.2 in Debian, and I am wondering if > it would make sense to call it "bioperl-live" and distribute it in > parallel with the stable 1.4.0 version, if bioperl-live means "the > current developepr version". > > If I am wrong, can somebody explain me what bioperl-live exactly refers > to ? bioperl-live is the name of the CVS repository containing what is currently considered the 'Core package' or core modules. http://www.bioperl.org/wiki/Using_CVS If you want to call it something to distinguish it from stable, call it 'developer' vs 'stable' or '1.5.2' vs '1.4.0'. To distinguish them both from the other packages, call them 'core' vs 'run' etc. 
From hlapp at gmx.net Thu Jun 28 06:31:29 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 28 Jun 2007 07:31:29 -0300 Subject: [Bioperl-l] Splits again In-Reply-To: <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org> Message-ID: <23C3729D-6E13-445F-99AF-54B4DC0BB4E8@gmx.net> On Jun 28, 2007, at 12:51 AM, Jason Stajich wrote: > [...] Also - the main point I wanted to make - Can I suggest we > spend a > little time discussing what it will take to get a stable release for > the current code as it stands (bioperl-live and bioperl-run)? It > seems like we really need to do this first so that we have a stable > release that can be followed by CVS -> SVN migration, then consider > major changes to the repository structure and release packaging, and > potential deprecation and incorporation of other modules. I agree we need to discuss a path towards 1.6, but I think that should be kept separate from the cvs->svn migration. Otherwise one stalls the other (by stopping people who seem to have the energy and motivation right now to do one but not the other) for no really good reason. > I assume there is no chance that we'd have a 1.6 candidate by BOSC > next month? I'm not sure that's feasible to be happening but if someone steps up it maybe it is. > > Will it be productive to schedule a fair amount of time at BOSC > discussing how to partition out the packages into separate sub- > packages after we've done a successful release rather than trying to > change things right now? I agree. I also don't think that people are partitioning right now (other than the existing partitioning), though maybe I'm mistaken. 
> [...] > It would probably mean moving Bio::Graphics, Bio::DB::GFF and > Bio::DB::SeqFeature and gff tools for Gbrowse into separate packages > so they could be released more regularly on par with Gbrowse > schedules. Possibly. I'm not fully sure why those modules couldn't also be released more often out of the "main trunk" of modules. In Java/ant, it'd be relatively easy to write build script filters that select the appropriate modules and package them on the fly. I'm not sure whether the build tools for Perl can do that too, though. > Also I think someone needs to figure out Bio::Tools::GFF > vs Bio::FeatureIO -- what do we want to do? I believe FeatureIO has the ontology download tied into it? -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Thu Jun 28 06:47:39 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 28 Jun 2007 07:47:39 -0300 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> <18051.1253.87485.235496@almost.alerce.com> Message-ID: On Jun 28, 2007, at 12:29 AM, Jason Stajich wrote: > As I tried to ask for in the past, would someone also illustrate the > importance of why _WE_ need to switch to SVN on a wiki page on > Bioperl so that when someone complains/asks about this in the future > the arguments are already laid out. I am basically fine with it, but > I don't honestly see a compelling reason beyond what has been > mentioned wrt better integration in IDEs. > http://bioperl.org/wiki/Why_SVN I guess at the end of the day svn is just the system of choice for new developers. I've had people tell me who started with svn that cvs seems a lot harder to use. 
The newer projects are all on svn and for example to integrate Bio::Phylo into BioPerl should become a question of the revision control system. At the end of the day if being on svn makes it easier for new people to contribute it's enough of an argument for me, whether it's rational or not. IMHO, there's two advantages that svn has over cvs. First, directories are versioned, have properties, and generally are the same class of citizens as files. They can be added, renamed, and removed from the repository. In cvs, we all know what a hassle it is to rename or even retire directories. Second, svn log gives you the commits, i.e., the set of changes that constituted one particular commit (and therefore version increase). In cvs that's hard or impossible to reconstruct. Bottom line - I don't think many people if any will question why we moved from cvs to svn ... My $0.02 ... -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hartzell at alerce.com Wed Jun 27 20:34:37 2007 From: hartzell at alerce.com (George Hartzell) Date: Wed, 27 Jun 2007 20:34:37 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <1CB82DD7-A997-47E8-A6A5-2AC9B375E875@uiuc.edu> References: <4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu> <9FEF16FD-47F6-4D7B-A1E6-1A92CC82570C@gmx.net> <1CB82DD7-A997-47E8-A6A5-2AC9B375E875@uiuc.edu> Message-ID: <18051.541.684705.567954@almost.alerce.com> Chris Fields writes: > We should port them all, yes. > > chris > > On Jun 27, 2007, at 4:35 PM, Hilmar Lapp wrote: > > > Is there a reason not to port every subproject over? > > > > -hilmar They're all there. At least everything that I found in the CVS repo. Some of the directories were empty, some had very little content, I was just mechanical about it. 
Here's what I have: [hartzell at dev ~]$ svn ls file://`pwd`/bioperl biodata/ bioperl-cookbook/ bioperl-corba-client/ bioperl-corba-server/ bioperl-das-client/ bioperl-db/ bioperl-ext/ bioperl-gui/ bioperl-live/ bioperl-microarray/ bioperl-network/ bioperl-papers/ bioperl-pedigree/ bioperl-pipeline/ bioperl-run/ biosql-schema/ html/ task-manager/ xml-html/ I wasn't very clear in my original request, but I was hoping that someone out there who's familiar with the various out-of-the-way bits and pieces could take a look at them. I was afraid that everyone was just checking out bioperl-live and doing 'make test'. Someone (chris?) made a point about binary files in bioperl-run. It'd be great if someone in the know could check on them. Also, to the degree that it's possible, look around at various tags and branches and see if they're what you'd expect. Thanks! g. From bix at sendu.me.uk Thu Jun 28 08:21:37 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 13:21:37 +0100 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <18049.30026.61328.134490@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> Message-ID: <4683A7D1.8070403@sendu.me.uk> George Hartzell wrote: > Chris Fields writes: > > [...] > > It looks like George Hartzell may be taking a crack at it, with > > Rutger Vos, Nathan Haigh, and moi helping out where needed. If so we > > could have something testable relatively soon. After that we'll need > > to work out a few other issues, basically what's on Hilmar's list. > > There's a repository on file:///home/hartzell/bioperl with all of the > components projects in place. 
> > If you have a dev.open-bio.org account and you're in the bioperl > group, you're good to get at it via: > > file:///home/hartzell/bioperl I'm confused. Presumably that only works whilst logged into dev.open-bio.org? > svn+ssh://dev.open-bio.org/home/hartzell/bioperl I just tried: svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl on Mac OS X and things seemed to go well, except for this error message at the end: svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data' svn: Can't move source to dest svn: Can't move 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' to 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': No such file or directory I also ended up with only: bioperl-corba-server bioperl-db bioperl-live bioperl-network bioperl-papers biosql-schema Am I doing something totally wrong here? From hartzell at alerce.com Thu Jun 28 08:32:36 2007 From: hartzell at alerce.com (George Hartzell) Date: Thu, 28 Jun 2007 08:32:36 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> <18051.1253.87485.235496@almost.alerce.com> Message-ID: <18051.43620.481558.447399@almost.alerce.com> Jason Stajich writes: > [...] > The repository machine (dev) is a locked down machine meaning it only > really runs ssh and not many servers include httpd. We have > anonymous CVS (client and through httpd browsing) running on a > separate machine (code) that has the info rsynced over every 10 or 15 > minutes. A great way to provide a read-only mirror of the repos. for anonymous users is to have svnsync running out of cron on code.open-bio.org, configured to pull from the dev.open-bio.org repository. 
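For anyone who hasn't used svnsync, here is a rough sketch of what the cron-driven mirror described above might look like. It is only an illustration under assumptions: the mirror path, the source URL and the function names are placeholders I made up, not the actual layout on code.open-bio.org or dev.open-bio.org, and non-interactive authentication to the source repository would have to be arranged separately.

```shell
#!/bin/sh
# Sketch only: one-time setup and periodic refresh of a read-only svn mirror.
# MIRROR_DIR and SOURCE_URL are hypothetical placeholders.
MIRROR_DIR=${MIRROR_DIR:-/path/to/mirror/bioperl}
SOURCE_URL=${SOURCE_URL:-svn+ssh://dev.example.org/path/to/bioperl}

# Run once on the mirror host.
setup_mirror() {
    svnadmin create "$MIRROR_DIR" || return 1
    # svnsync records its bookkeeping in revision properties, so the mirror
    # needs a pre-revprop-change hook that allows those changes.
    printf '#!/bin/sh\nexit 0\n' > "$MIRROR_DIR/hooks/pre-revprop-change"
    chmod +x "$MIRROR_DIR/hooks/pre-revprop-change"
    svnsync initialize "file://$MIRROR_DIR" "$SOURCE_URL"
}

# Run from cron; copies any revisions the mirror does not have yet.
refresh_mirror() {
    svnsync synchronize "file://$MIRROR_DIR"
}
```

A crontab entry would then call a small wrapper that sources this file and runs refresh_mirror every 10 or 15 minutes, matching the current rsync schedule. Nobody should commit directly to the mirror, otherwise svnsync will refuse to continue.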
It might actually work to have rsync mirror the fsfs-backed repository, but that's scary-poking-into-the-internals. g. From hartzell at alerce.com Thu Jun 28 08:43:37 2007 From: hartzell at alerce.com (George Hartzell) Date: Thu, 28 Jun 2007 08:43:37 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> Message-ID: <18051.44281.831316.749586@almost.alerce.com> David Messina writes: > > On Jun 27, 2007, at 2:49 PM, Hilmar Lapp wrote: > > > > > On Jun 27, 2007, at 1:27 PM, David Messina wrote: > > > >> I would think we would want "Author Date Id Rev URL" set on > >> everything, no?. So either cvs2svn or your tool (whichever you think > >> is better), followed by > >> > >> svn propset svn:keywords "Author Date Id Rev URL" * > > > > Shouldn't this be done recursively? > > > Yep, good catch! Thanks, Hilmar. > > Should be: > > svn propset --recursive svn:keywords "Author Date Id Rev URL" * That's not quite what you want either. It'll set the keyword property on all of the files, including things where you probably don't want expansion to happen (e.g. images, someone said there are binary wads in bioperl-run, etc...). The Right Thing To Do is to grub around (grep) for '\$Id:' (and the others) and set svn:keywords on files that are already using keywords. I have a bourne shell hack that'll do this, although it's painful because it has to run in working directories.... Once we settle on a list of keywords to use, I'll take a whack at the demo repository.
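To make the grep-then-propset idea above concrete, here is a minimal sketch. It is not the actual bourne shell hack mentioned above; the function name find_keyword_files and the exact keyword list are assumptions for illustration, and it relies on a grep that supports -r, -I and --exclude-dir (GNU grep does).

```shell
#!/bin/sh
# Sketch only: list files under a directory that already contain a
# CVS/RCS-style keyword marker such as $Id: ... $ or $Revision$.
# The function name and the keyword list are illustrative assumptions.
find_keyword_files() {
    dir=${1:-.}
    # -r recurse, -l print only names of matching files, -I skip binary
    # files, -E extended regex; .svn administrative directories are skipped.
    grep -rlIE --exclude-dir=.svn \
        '\$(Author|Date|Id|Rev|Revision|URL)(:[^$]*)?\$' "$dir"
}

# In a working copy, the matching files could then be given the property:
#   find_keyword_files . | while IFS= read -r f; do
#       svn propset svn:keywords "Author Date Id Rev URL" "$f"
#   done
```

Because -I makes grep skip binary files, images and similar blobs are left alone, which is the point of not using a blanket recursive propset. The propset loop is left as a comment since it has to be run inside a checked-out working copy.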
Likewise, you probably DON'T want to use this in your config file:

  enable-auto-props = yes
  * = svn:keywords="Author Date Id Rev URL"

since it'll do the same thing.

The Right Thing To Do is a more tedious

  *.pl = svn:keywords="Author Date Id Rev URL"
  *.pm = svn:keywords="Author Date Id Rev URL"
  *.c = svn:keywords="Author Date Id Rev URL"

A bit of googling will give you a good starting point for the list, and we should probably maintain a common one somewhere in the repo.

I don't think that there's a server side way of doing this, short of running some script via a hook around commit time.

g.

From hartzell at alerce.com Thu Jun 28 08:54:40 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 08:54:40 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy]
In-Reply-To:
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> <18051.1253.87485.235496@almost.alerce.com>
Message-ID: <18051.44944.982207.37624@almost.alerce.com>

Hilmar Lapp writes:
 > [...]
 > IMHO, there's two advantages that svn has over cvs. First,
 > directories are versioned, have properties, and generally are the
 > same class of citizens as files. They can be added, renamed, and
 > removed from the repository. In cvs, we all know what a hassle it is
 > to rename or even retire directories. Second, svn log gives you the
 > commits, i.e., the set of changes that constituted one particular
 > commit (and therefore version increase). In cvs that's hard or
 > impossible to reconstruct.

Two more:

 - svn groups changes into revisions, so that they can be considered together, CVS versions individual files.
 - subversion tracks renames/moves correctly,
 - subversion commits are atomic, so you never have to worry about all of your stuff making it into the repos. at the same time [if you've never had to un-muck this, count yourself blessed!]
, - svk, which allows disconnected development while still commiting your work to a repo at natural points along the way (you can revert, branch, etc.... to your hearts content). [yeah, that's 3, err, 4. Math is hard.] g. From cjfields at uiuc.edu Thu Jun 28 09:07:24 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 28 Jun 2007 08:07:24 -0500 Subject: [Bioperl-l] Splits again In-Reply-To: <23C3729D-6E13-445F-99AF-54B4DC0BB4E8@gmx.net> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org> <23C3729D-6E13-445F-99AF-54B4DC0BB4E8@gmx.net> Message-ID: <01812F01-9409-49FB-9061-330FA52177C1@uiuc.edu> On Jun 28, 2007, at 5:31 AM, Hilmar Lapp wrote: > > On Jun 28, 2007, at 12:51 AM, Jason Stajich wrote: > >> ...It >> seems like we really need to do this first so that we have a stable >> release that can be followed by CVS -> SVN migration, then consider >> major changes to the repository structure and release packaging, and >> potential deprecation and incorporation of other modules. > > I agree we need to discuss a path towards 1.6, but I think that > should be kept separate from the cvs->svn migration. Otherwise one > stalls the other (by stopping people who seem to have the energy and > motivation right now to do one but not the other) for no really good > reason. It's good to discuss it as long as it doesn't take time and energy away from other priorities. >> I assume there is no chance that we'd have a 1.6 candidate by BOSC >> next month? > > I'm not sure that's feasible to be happening but if someone steps up > it maybe it is. Maybe a 1.5.3 and (if we work hard on it) a 1.6 soon after. 
Then maybe work on partitioning if everyone's up for it and a scheme is worked out. >> Will it be productive to schedule a fair amount of time at BOSC >> discussing how to partition out the packages into separate sub- >> packages after we've done a successful release rather than trying to >> change things right now? > > I agree. I also don't think that people are partitioning right now > (other than the existing partitioning), though maybe I'm mistaken. The original proposal was based on Steve's idea of splitting up core. I don't think a partition is feasible at this point, at least until we put more thought into it (our energy should be focused elsewhere), but it's well worth discussing as a future path. At this time there are two proposals: 1) Steve's and my 'split into discrete sections' proposal, where we split core into self-sustaining sections with a common core listed as a dependency, tying installation of all together with a Bundle or similar. 2) Sendu's 'break everything up' approach where all modules are submitted independently to CPAN, with their own tests, dependencies, etc. There are advantages and disadvantages to both approaches. Not sure if CPAN would go for the latter (it's pretty drastic), but I don't know for sure. If you want in on that discussion (in this thread) feel free to join in! The more the merrier! >> [...] >> It would probably mean moving Bio::Graphics, Bio::DB::GFF and >> Bio::DB::SeqFeature and gff tools for Gbrowse into separate packages >> so they could be released more regularly on par with Gbrowse >> schedules. > > Possibly. I'm not fully sure why those modules couldn't also be > released more often out of the "main trunk" of modules. In Java/ant, > it'd be relatively easy to write build script filters that select the > appropriate modules and package them on the fly. I'm not sure whether > the build tools for Perl can do that too, though. 
Both approaches above would probably use Module::Build to install other bioperl dependencies, each of which could have it's own dependency set, possibly using a Bundle to tie everything together. >> Also I think someone needs to figure out Bio::Tools::GFF >> vs Bio::FeatureIO -- what do we want to do? > > I believe FeatureIO has the ontology download tied into it? > > -hilmar From recent posts here and on the gbrowse mail list by Scott and Lincoln, it seemed like they were moving away from using Bio::DB::GFF and were trying to get users to switch to Bio::DB::SeqFeature. Maybe should get a more direct response? chris From hartzell at alerce.com Thu Jun 28 09:16:18 2007 From: hartzell at alerce.com (George Hartzell) Date: Thu, 28 Jun 2007 09:16:18 -0400 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <4683A7D1.8070403@sendu.me.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> Message-ID: <18051.46242.942184.758493@almost.alerce.com> Sendu Bala writes: > George Hartzell wrote: > > Chris Fields writes: > > > [...] > > > It looks like George Hartzell may be taking a crack at it, with > > > Rutger Vos, Nathan Haigh, and moi helping out where needed. If so we > > > could have something testable relatively soon. After that we'll need > > > to work out a few other issues, basically what's on Hilmar's list. > > > > There's a repository on file:///home/hartzell/bioperl with all of the > > components projects in place. > > > > If you have a dev.open-bio.org account and you're in the bioperl > > group, you're good to get at it via: > > > > file:///home/hartzell/bioperl > > I'm confused. Presumably that only works whilst logged into > dev.open-bio.org? 
Yes, that only works if you're actually on the machine. > > svn+ssh://dev.open-bio.org/home/hartzell/bioperl > > I just tried: > > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl > > on Mac OS X and things seemed to go well, except for this error message > at the end: > > > svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data' > svn: Can't move source to dest > svn: Can't move > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' > to > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': > No such file or directory > > I also ended up with only: > bioperl-corba-server bioperl-db bioperl-live > bioperl-network bioperl-papers biosql-schema > > > Am I doing something totally wrong here? It looks like you tried to check out the *entire* repository. It never occured to me to try that. I'll take a look at what you reported. g. From bix at sendu.me.uk Thu Jun 28 09:20:19 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 14:20:19 +0100 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <18051.46242.942184.758493@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.46242.942184.758493@almost.alerce.com> Message-ID: <4683B593.3050108@sendu.me.uk> George Hartzell wrote: > Sendu Bala writes: >> I just tried: >> >> svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl [snip] > It looks like you tried to check out the *entire* repository. Yes. If you don't want everything, how does one 'browse' the repository to find out the address of the thing you /do/ want? > It never occured to me to try that. I'll take a look at what you > reported. Cheers. 
From bix at sendu.me.uk Thu Jun 28 09:27:29 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 14:27:29 +0100 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <18049.22260.967524.353173@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.22260.967524.353173@almost.alerce.com> Message-ID: <4683B741.5020600@sendu.me.uk> George Hartzell wrote: > There don't seem to be any .cvsignore files in the repository, or in > CVSROOT/cvsignore. > > Am I missing something, or don't we use them? It would be great to have the following files svn:ignored : In all package roots: ? Build ? MANIFEST ? MANIFEST.SKIP ? META.yml ? _build ? bioperl-*.tar.bz2 ? bioperl-*.tar.gz ? bioperl-*.zip ? blib ? cover_db In any and all directories: ? .DS_Store ? .DAV In bioperl-live: ? t/BioDBSeqFeature.t ? t/BioDBSeqFeature_BDB.t ? t/BioDBSeqFeature_mysql.t Can't think of anything else right now. Thanks for your efforts, Sendu. From cjfields at uiuc.edu Thu Jun 28 09:30:43 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 28 Jun 2007 08:30:43 -0500 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <4683A7D1.8070403@sendu.me.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> Message-ID: On Jun 28, 2007, at 7:21 AM, Sendu Bala wrote: >> ... >> file:///home/hartzell/bioperl > > I'm confused. Presumably that only works whilst logged into > dev.open-bio.org? Yes, it's just a tester. 
>> svn+ssh://dev.open-bio.org/home/hartzell/bioperl > > I just tried: > > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl Try 'svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- live/trunk /mybiodir' to check out the main trunk for core. chris From hartzell at alerce.com Thu Jun 28 09:57:00 2007 From: hartzell at alerce.com (George Hartzell) Date: Thu, 28 Jun 2007 09:57:00 -0400 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <4683A7D1.8070403@sendu.me.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> Message-ID: <18051.48684.996884.134046@almost.alerce.com> Sendu Bala writes: > [...] > I just tried: > > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl > > on Mac OS X and things seemed to go well, except for this error message > at the end: > > > svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data' > svn: Can't move source to dest > svn: Can't move > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' > to > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': > No such file or directory > > I also ended up with only: > bioperl-corba-server bioperl-db bioperl-live > bioperl-network bioperl-papers biosql-schema > > > Am I doing something totally wrong here? So, you probably wanted something like svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk to pick up the head of the bioperl live tree (or /.../bioperl-run/trunk, etc...). 
I just checked out

  svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/

and it ran to completion and gave me

(delicious)[6:50am]~/tmp>>ls bioperl | cat
biodata
bioperl-cookbook
bioperl-corba-client
bioperl-corba-server
bioperl-das-client
bioperl-db
bioperl-ext
bioperl-gui
bioperl-live
bioperl-microarray
bioperl-network
bioperl-papers
bioperl-pedigree
bioperl-pipeline
bioperl-run
biosql-schema
html
task-manager
xml-html

Can another mac os x user out there give the Great Big Checkout a try and see if it runs to completion? Potential problems that come to mind are:

 - the "mac's are case insensitive, sort of" problem
 - you filled up your disk
 - something else.

g.

From charles-listes+bioperl at plessy.org Thu Jun 28 09:44:56 2007
From: charles-listes+bioperl at plessy.org (Charles Plessy)
Date: Thu, 28 Jun 2007 22:44:56 +0900
Subject: [Bioperl-l] Is bioperl-live the nickname of the CVS head ?
In-Reply-To: <46837687.7010101@sheffield.ac.uk>
References: <20070628074004.GD6338@kunpuu.plessy.org> <46837687.7010101@sheffield.ac.uk>
Message-ID: <20070628134456.GB14492@kunpuu.plessy.org>

On Thu, Jun 28, 2007 at 09:51:19AM +0100, Nathan S. Haigh wrote:
>
> Version 1.5.* is the developer release, while the 1.4.* is the stable
> release. However, there have been few updates to the 1.4.* release which
> means that it is more unstable than the 1.5.* dev release. I think the
> consensus, was to have more rapid release cycles of the stable branch in
> future in order to avoid this. I'm sure there are others more qualified
> to expand/correct me on this if needs be.

Ok, thank you all for the answers. I think that I will simply upgrade bioperl to 1.5.2 in Debian testing, and maybe rename it bioperl-core when I package other components.
Have a nice day, -- Charles Plessy Debian-Med packaging team Wako, Saitama, Japan From bix at sendu.me.uk Thu Jun 28 10:19:49 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 15:19:49 +0100 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <18051.48684.996884.134046@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> Message-ID: <4683C385.3050904@sendu.me.uk> George Hartzell wrote: > Sendu Bala writes: > > [...] > > I just tried: > > > > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl > > > > on Mac OS X and things seemed to go well, except for this error message > > at the end: > > > > > > svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data' > > svn: Can't move source to dest > > svn: Can't move > > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' > > to > > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': > > No such file or directory > > > > I also ended up with only: > > bioperl-corba-server bioperl-db bioperl-live > > bioperl-network bioperl-papers biosql-schema I tried again in the same location and it told me I had to 'svn cleanup', which I did. But subsequently it kept complaining about files already being there. > I just checked out > > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/ > > and it ran to completion [snip] > Can another mac os x user out there give the Great Big Checkout a try > and see if it runs to completion. Potential problems that come to > mind are: > > - the "mac's are case insensitive, sort of" problem > - you filled up your disk > - something else. 
Well, I didn't run out of disc space. After a rm -fr * and trying again it failed at exactly the same point, in the same way. svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data causes this repeatable problem: [...] A data/phredfile.phd svn: In directory 'data' svn: Can't move source to dest svn: Can't move 'data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' to 'data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': No such file or directory That is with Mac OS X svn command-line client, version 1.4.4 I can get bioperl-live/tags/release-0-9-2/t/data to check out fine with a linux svn command-line client, version 1.2.3. Cheers, Sendu. From dmessina at wustl.edu Thu Jun 28 11:08:59 2007 From: dmessina at wustl.edu (David Messina) Date: Thu, 28 Jun 2007 10:08:59 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <18051.44281.831316.749586@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> Message-ID: > [George] > Likewise, you probably DON'T want to use this in your config file: > > enable-auto-props = yes > * = svn:keywords="Author Date Id Rev URL" > > since it'll do the same thing. Ah, so I've been doing it wrong all along then. :) Thanks, George! > The Right Thing To Do is a more tedious > > *.pl = svn:keywords="Author Date Id Rev URL" > *.pm = svn:keywords="Author Date Id Rev URL" > *.c = svn:keywords="Author Date Id Rev URL" > > A bit of googling will give you a good starting point for the list, > and we should probably maintain a common one somewhere in the repo. 
I've googled around and gathered the following as a possible list for our repo. Since I obviously don't know what I'm doing :), of course adjust and refine as necessary.

Dave

-------

[auto-props]

# Code formats
*.c = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.cpp = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.h = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.java = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.as = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.cgi = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.js = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/javascript
*.php = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-php
*.pl = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-perl; svn:executable
*.pm = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-perl
*.py = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-python; svn:executable
*.sh = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-sh; svn:executable

# Image formats
*.bmp = svn:mime-type=image/bmp
*.gif = svn:mime-type=image/gif
*.ico = svn:mime-type=image/ico
*.jpeg = svn:mime-type=image/jpeg
*.jpg = svn:mime-type=image/jpeg
*.png = svn:mime-type=image/png
*.tif = svn:mime-type=image/tiff
*.tiff = svn:mime-type=image/tiff

# Data formats
*.pdf = svn:mime-type=application/pdf
*.avi = svn:mime-type=video/avi
*.doc = svn:mime-type=application/msword
*.eps = svn:mime-type=application/postscript
*.gz = svn:mime-type=application/gzip
*.mov = svn:mime-type=video/quicktime
*.mp3 = svn:mime-type=audio/mpeg
*.ppt = svn:mime-type=application/vnd.ms-powerpoint
*.ps = svn:mime-type=application/postscript
*.psd = svn:mime-type=application/photoshop
*.rtf = svn:mime-type=text/rtf
*.swf = svn:mime-type=application/x-shockwave-flash
*.tgz = svn:mime-type=application/gzip
*.wav = svn:mime-type=audio/wav
*.xls = svn:mime-type=application/vnd.ms-excel
*.zip = svn:mime-type=application/zip

# Text formats
.htaccess = svn:mime-type=text/plain
*.css = svn:mime-type=text/css
*.dtd = svn:mime-type=text/xml
*.html = svn:mime-type=text/html
*.ini = svn:mime-type=text/plain
*.sql = svn:mime-type=text/x-sql
*.txt = svn:mime-type=text/plain
*.xhtml = svn:mime-type=text/xhtml+xml
*.xml = svn:mime-type=text/xml
*.xsd = svn:mime-type=text/xml
*.xsl = svn:mime-type=text/xml
*.xslt = svn:mime-type=text/xml
*.xul = svn:mime-type=text/xul
*.yml = svn:mime-type=text/plain
CHANGES = svn:mime-type=text/plain
COPYING = svn:mime-type=text/plain
INSTALL = svn:mime-type=text/plain
Makefile* = svn:mime-type=text/plain
README = svn:mime-type=text/plain
TODO = svn:mime-type=text/plain

From dmessina at wustl.edu Thu Jun 28 11:11:23 2007
From: dmessina at wustl.edu (David Messina)
Date: Thu, 28 Jun 2007 10:11:23 -0500
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683B593.3050108@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.46242.942184.758493@almost.alerce.com> <4683B593.3050108@sendu.me.uk>
Message-ID:

> [Sendu]
>
> Yes. If you don't want everything, how does one 'browse' the repository
> to find out the address of the thing you /do/ want?
svn ls file:///home/hartzell/bioperl (while logged into dev.open-bio.org)

or

svn ls svn+ssh://dev.open-bio.org/home/hartzell/bioperl

From n.haigh at sheffield.ac.uk Thu Jun 28 11:13:58 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 28 Jun 2007 16:13:58 +0100
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683B593.3050108@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.46242.942184.758493@almost.alerce.com> <4683B593.3050108@sendu.me.uk>
Message-ID: <4683D036.5060109@sheffield.ac.uk>

Sendu Bala wrote:
> George Hartzell wrote:
>> Sendu Bala writes:
>>> I just tried:
>>>
>>> svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl
> [snip]
>> It looks like you tried to check out the *entire* repository.
>
> Yes. If you don't want everything, how does one 'browse' the repository
> to find out the address of the thing you /do/ want?
>

You could try:

svn ls

or

svn ls -R

to get a list of directories.

>
>> It never occurred to me to try that. I'll take a look at what you
>> reported.
>
> Cheers.

From cjfields at uiuc.edu Thu Jun 28 11:20:46 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 10:20:46 -0500
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683C385.3050904@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> <4683C385.3050904@sendu.me.uk>
Message-ID: <6830CBB0-70A1-4F84-9C52-F61AEDC4B83B@uiuc.edu>

I can replicate the same problem (Mac OS X) with a full checkout:

svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data'
svn: Can't move source to dest
svn: Can't move 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' to 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': No such file or directory

What local (mac) svn version are you using? I'm running off macports:

svn --version
svn, version 1.4.4 (r25188)
   compiled Jun 16 2007, 23:40:53

chris

On Jun 28, 2007, at 9:19 AM, Sendu Bala wrote:

...
> I tried again in the same location and it told me I had to 'svn
> cleanup', which I did. But subsequently it kept complaining about files
> already being there.
>>
> [snip]
>> Can another mac os x user out there give the Great Big Checkout a try
>> and see if it runs to completion.
Potential problems that come to >> mind are: >> >> - the "mac's are case insensitive, sort of" problem >> - you filled up your disk >> - something else. > > Well, I didn't run out of disc space. After a rm -fr * and trying > again > it failed at exactly the same point, in the same way. > > svn co > svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/ > release-0-9-2/t/data > > causes this repeatable problem: > > [...] > A data/phredfile.phd > svn: In directory 'data' > svn: Can't move source to dest > svn: Can't move 'data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' to > 'data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': No such file or > directory > > That is with Mac OS X svn command-line client, version 1.4.4 > > I can get bioperl-live/tags/release-0-9-2/t/data to check out fine > with > a linux svn command-line client, version 1.2.3. > > > Cheers, > Sendu. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Thu Jun 28 11:37:27 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 28 Jun 2007 10:37:27 -0500 Subject: [Bioperl-l] Splits again In-Reply-To: <4683624F.6020402@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: On Jun 28, 2007, at 2:25 AM, Sendu Bala wrote: > Chris Fields wrote: >> ... > > The short and sweet version: my proposal has all the benefits of > yours, but none of the disadvantages. What's not to like? 
The short and sweet version: I'm more convinced after you laid out your argument in detail, which would have saved me some typing last night, BTW, thanks! ; > The other core devs need to chip in and we need to openly (candidly) discuss it some more (I've added Hilmar to this). There is also a tenable solution that allows both aspects ('cliques' and single mode) which might make everybody happy. Let's say we only want to install Bio::SeqIO::genbank. The Bio::SeqIO::genbank Build.PL would only install what was needed (as you indicated), only Bio::SeqIO::genbank-related tests would run (along with dependency test, if available), and life would go on. However, what if we wanted to install everything in SeqIO/DB/AlignIO/ etc? We could have the Bio::SeqIO Build.PL ask whether you want all SeqIO modules installed or a select few (maybe a quick 'install all (y/n)?' followed by a list, which installs them one at a time along with dependencies), or have the option to specifically denote them as passed args to SeqIO's Build.PL, something like 'perl Build.PL - install-plugins genbank embl swiss', 'perl Build.PL -install-plugins all', etc. If a specific module (Bio::SeqIO::genbank) is installed directly then maybe the installation q&a's of followed modules could be bypassed when installing down the dependency tree with additional passed args. This would, in effect, be a bioperl-specific mini-CPAN within CPAN. Nice! Now, this doesn't address several related issues, such as how we handle versioning of the independent modules (should be in a controlled manner), what we do about deprecated modules which linger about on CPAN, how we deal with PPMs/RPMs/packaging, and so on. All have possible reasonable ways they can be addressed, I believe. Also, I think we should still think about doing regular full-scale 'stable' (1.#) releases (sort of our stamp of approval for that batch of modules at that point in time, with a reasonable 'sell-by' date). 
Again, it should be seriously discussed among the core devs and the bioperl community at large prior to any serious work on it, and it would be quite a large-scale project, but possibly worth it. It can only go forward if there is enough momentum behind it. >> Finally, all of this should wait until later. Much later, like >> after a decent release, after svn, etc kind of 'later'. I think >> we can agree on that. > > Hmm, not really. If it can be implemented by a change in just > Build.PL and ModuleBuildBioperl, its really independent of > everything else. That's the beauty of it: the only thing that > changes is how things are uploaded to and downloaded from CPAN. The > only person that normally deals with that issue is the pumpkin for > a release, and he only cares about it at release time. > > In fact, if we're going to do it at all it makes sense to try it > out on a minor release like 1.5.3. We've already got experience of > doing it split-style from 1.5.2. (And let me tell you: splits at > the code-base level suck.) BOSC is coming up, and I would like to focus on getting svn migration taken care of ASAP (which is sounding more and more like we plan on moving all open-bio over, unless I misread Jason's post?) and stomping of bugs (my next priority after EUtilities). Maybe in the interim we should try focusing on bug squashing, get out a quick standard dev release (1.5.3) before BOSC, and then a few of us could all communicate there via email/text/IM/phone off-list? Maybe post updates via the bioperl blog and list? > And where is the harm in letting them do it via CPAN as well? In > fact, there are significant benefits: ... I'm already pretty convinced... > The same can be achieved with CPAN bundles for each kind of > functional grouping you can think of. 
And since its just a single > text file that defines such a grouping, its easy to change or add > new ones as you feel like it, as opposed to the rather more > permanent and substantial effort of creating one of your splits on > the code-base level. ... or it could be run right in Module::Build for specific parent classes (as I mention above). Bundling could be instituted for something like a standard GBrowse release (Bundle::BioPerl::GBrowse) where the functionality might be more spread out (Bio::DB*, Bio::Graphics, Bio::FeatureIO, etc). For a full-scale old-style core install, another Bundle (Bundle::BioPerl::Standard). ... > Yes, it would be automated, and no, it wouldn't at all be any kind > of additional headache. I'm proposing a fully-automated system that > the pumpkin wouldn't even have to think about it. Much /less/ of a > headache than dealing with splits. Orders of magnitude easier to > deal with. The 'headache' would be the initial setup (splitting test, individual Build.PL, etc), but this could be done stepwise or section-wise, I suppose. ... > And the smallest, most concentrated set of modules is the > individual module. Well, only if it runs correctly (i.e. has the entire dep. tree installed). But the 'follow' tests would handle that. > The reason some of these existing splits (micoarray, ext) have > fallen by the way-side? /Because/ they're splits. If they had been > part of bioperl-live all along, they'd have been kept in a working, > compatible state and would have been released along with everything > else in 1.5.2 microarray fell out of favor for other reasons (much faster ways to do the same thing via R), though I think it still could be salvaged if someone wanted to take it up. the other bioperl distros (network, db, run, etc) would also necessitate following the same path as core, but I guess they could be bundled as well. > ... > No headaches. I already have one, sorry! 
chris From n.haigh at sheffield.ac.uk Thu Jun 28 11:53:52 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Thu, 28 Jun 2007 16:53:52 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: <4683D990.8090909@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Chris Fields wrote: > On Jun 28, 2007, at 2:25 AM, Sendu Bala wrote: > >> Chris Fields wrote: >>> ... >> >> The short and sweet version: my proposal has all the benefits of >> yours, but none of the disadvantages. What's not to like? > > The short and sweet version: I'm more convinced after you laid out your > argument in detail, which would have saved me some typing last night, > BTW, thanks! ; > > > The other core devs need to chip in and we need to openly (candidly) > discuss it some more (I've added Hilmar to this). There is also a > tenable solution that allows both aspects ('cliques' and single mode) > which might make everybody happy. Couldn't "cliques" simply be satisfied with CPAN Bundles? > > Let's say we only want to install Bio::SeqIO::genbank. The > Bio::SeqIO::genbank Build.PL would only install what was needed (as you > indicated), only Bio::SeqIO::genbank-related tests would run (along with > dependency test, if available), and life would go on. However, what if > we wanted to install everything in SeqIO/DB/AlignIO/etc? I think this might be where Bundles come in for installing these "cliques" of related modules? - -- snip -- > >> Yes, it would be automated, and no, it wouldn't at all be any kind of >> additional headache. I'm proposing a fully-automated system that the >> pumpkin wouldn't even have to think about it. 
Much /less/ of a >> headache than dealing with splits. Orders of magnitude easier to deal >> with. > > The 'headache' would be the initial setup (splitting test, individual > Build.PL, etc), but this could be done stepwise or section-wise, I suppose. Yes, I think this is where most of the labour will be. However, setting the test suite up like this would be beneficial with or without publishing modules individually. - -- snip -- -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGg9mQczuW2jkwy2gRAlfBAKCFP7XUvWXsjycSv0MVGN3Ru40D/wCcDiDg UKE/Q/wA3gu1Gb7S6rarCQw= =WQdY -----END PGP SIGNATURE----- From bix at sendu.me.uk Thu Jun 28 12:03:54 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 17:03:54 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: <4683DBEA.90005@sendu.me.uk> Chris Fields wrote: > On Jun 28, 2007, at 2:25 AM, Sendu Bala wrote: > Let's say we only want to install Bio::SeqIO::genbank. The > Bio::SeqIO::genbank Build.PL would only install what was needed (as you > indicated), only Bio::SeqIO::genbank-related tests would run (along with > dependency test, if available), and life would go on. However, what if > we wanted to install everything in SeqIO/DB/AlignIO/etc? > > We could have the Bio::SeqIO Build.PL ask whether you want all SeqIO > modules installed or a select few (maybe a quick 'install all (y/n)?' 
> followed by a list, which installs them one at a time along with > dependencies), or have the option to specifically denote them as passed > args to SeqIO's Build.PL, something like 'perl Build.PL -install-plugins > genbank embl swiss', 'perl Build.PL -install-plugins all', etc. If a > specific module (Bio::SeqIO::genbank) is installed directly then maybe > the installation q&a's of followed modules could be bypassed when > installing down the dependency tree with additional passed args. I'd probably stay away from something like this. My primary reason being, off-the-top-of-my-head I don't see how to get it to work. If you're installing Bio::SeqIO for the first time via CPAN you can't ask it to install Bio::SeqIO::genbank et al. at the same time because Bio::SeqIO::genbank depends on Bio::SeqIO, giving us some circularity. I also wouldn't want these things to be complicated. There should be little in the way of questions to ask during install. Each module's Build.PL should be ultra-simple with no advanced logic at all. It should just specify things that are absolute requirements. This simplicity helps avoid some of the problems we face by distributing the monolithic Bioperl. No, much better for us and for users to provide a Bundle::Bio-SeqIO. > Now, this doesn't address several related issues, such as how we handle > versioning of the independent modules (should be in a controlled > manner), When a module is changed, it gets a version bump. Nothing complicated needs to be done. Transparent and obvious, behaving like all other CPAN modules would be my choice. > what we do about deprecated modules which linger about on CPAN, Delete them from CPAN seems appropriate. > how we deal with PPMs/RPMs/packaging, and so on. All have possible > reasonable ways they can be addressed, I believe. 
Also, I think we > should still think about doing regular full-scale 'stable' (1.#) > releases (sort of our stamp of approval for that batch of modules at > that point in time, with a reasonable 'sell-by' date). Yes, we can still choose to take a snapshot and announce it to the world, but at the module-level nothing special would happen. There would just be an updated Bundle::Bioperl-everything (or whatever). > Again, it should be seriously discussed among the core devs and the > bioperl community at large prior to any serious work on it, and it would > be quite a large-scale project, but possibly worth it. It can only go > forward if there is enough momentum behind it. The requirement for this approach is per-module test scripts. Which as I identified already, is very desirable anyway so we can hit 100% test coverage. So, regardless of anything else can we all agree that per-module test scripts are a good idea and should be worked on? If so, I'll look into the feasibility and figure out how much work will be involved. From cjfields at uiuc.edu Thu Jun 28 13:17:50 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 28 Jun 2007 12:17:50 -0500 Subject: [Bioperl-l] Splits again In-Reply-To: <4683DBEA.90005@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> <4683DBEA.90005@sendu.me.uk> Message-ID: <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu> On Jun 28, 2007, at 11:03 AM, Sendu Bala wrote: > ... > I'd probably stay away from something like this. My primary reason > being, off-the-top-of-my-head I don't see how to get it to work. If > you're installing Bio::SeqIO for the first time via CPAN you can't > ask it to install Bio::SeqIO::genbank et al. 
at the same time > because Bio::SeqIO::genbank depends on Bio::SeqIO, giving us some > circularity. True... > I also wouldn't want these things to be complicated. There should > be little in the way of questions to ask during install. Each > module's Build.PL should be ultra-simple with no advanced logic at > all. It should just specify things that are absolute requirements. > This simplicity helps avoid some of the problems we face by > distributing the monolithic Bioperl. > > No, much better for us and for users to provide a Bundle::Bio-SeqIO. I just don't want too much Bundle-itis, as it'll get confusing for newbies (e.g. Vista-itis, or AdobeCS-itis). It should be limited to functional groupings (SeqIO, AlignIO, DB, etc), 'install everything', or distribution-specific (GBrowse). I also think (though Hilmar may veto this) that we should work on integrating bioperl-db, network, etc. into this if it goes forward. Here's a question: how do we plan on handling uploading bioperl updates to CPAN via PAUSE? Do we want to run every single module through one pumpkin? Or do we want to have a core dev group PAUSE account? I can see, for instance, removing everything EUtilities-related and submitting it independently using my own PAUSE account, but it would be nice to have it under an umbrella 'bioperl-devs' account instead. >> Now, this doesn't address several related issues, such as how we >> handle versioning of the independent modules (should be in a >> controlled manner), > > When a module is changed, it gets a version bump. Nothing > complicated needs to be done. Transparent and obvious, behaving > like all other CPAN modules would be my choice. > >> what we do about deprecated modules which linger about on CPAN, > > Delete them from CPAN seems appropriate. I know you can do that via PAUSE, but I think a deleted module lingers about on search.cpan.org (unless that's been fixed). This would prob. have to be used sparingly. >> how we deal with PPMs/RPMs/packaging, and so on.
All have >> possible reasonable ways they can be addressed, I believe. Also, >> I think we should still think about doing regular full-scale >> 'stable' (1.#) releases (sort of our stamp of approval for that >> batch of modules at that point in time, with a reasonable >> 'sell-by' date). > > Yes, we can still choose to take a snapshot and announce it to the > world, but at the module-level nothing special would happen. There > would just be an updated Bundle::Bioperl-everything (or whatever). Right, it would basically be a stamp of certification. >> Again, it should be seriously discussed among the core devs and >> the bioperl community at large prior to any serious work on it, >> and it would be quite a large-scale project, but possibly worth >> it. It can only go forward if there is enough momentum behind it. > > The requirement for this approach is per-module test scripts. Which > as I identified already, is very desirable anyway so we can hit > 100% test coverage. > > So, regardless of anything else can we all agree that per-module > test scripts are a good idea and should be worked on? If so, I'll > look into the feasibility and figure out how much work will be > involved. I think so, but the feasibility issue is critical. Do we want cvs/svn to be divided up into 900 subdirectories (one for each module), or do we want to have a similar directory structure as we have now, but with each module in its own directory? Or leave everything as is and generate Build.PL on-the-fly (prob. least feasible)? This is where it might be wise to do it piecemeal at first (maybe starting with something somewhat segregated like Bio::Tools), then progress from there.
chris From hartzell at alerce.com Thu Jun 28 13:38:48 2007 From: hartzell at alerce.com (George Hartzell) Date: Thu, 28 Jun 2007 13:38:48 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> Message-ID: <18051.61992.627473.323346@almost.alerce.com> David Messina writes: > > [George] > > Likewise, you probably DON'T want to use this in your config file: > > > > enable-auto-props = yes > > * = svn:keywords="Author Date Id Rev URL" > > > > since it'll do the same thing. > > Ah, so I've been doing it wrong all along then. :) Thanks, George! It's not *wrong* if it's never done anything to you that you've regretted. The right answer depends on your situation.... > [...] > I've googled around and gathered the following as a possible list for > our repo. Since I obviously don't know what I'm doing :), of course > adjust and refine as necessary. > That's a great starting point. Do you have write access to the wiki? Could you link it off of the instructions for using svn? g. 
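A minimal sketch of a narrower auto-props setup than the catch-all `*` pattern warned against above, assuming a client-side Subversion config file (typically ~/.subversion/config). The file patterns and property choices are illustrative assumptions only; they are not the list David gathered for the wiki.

```ini
# ~/.subversion/config (client side); the patterns below are illustrative assumptions
[miscellany]
enable-auto-props = yes

[auto-props]
# Text files: expand keywords and normalise line endings
*.pm = svn:eol-style=native;svn:keywords=Author Date Id Rev URL
*.pl = svn:eol-style=native;svn:keywords=Author Date Id Rev URL
*.t = svn:eol-style=native;svn:keywords=Author Date Id Rev URL
# Binary files: declare a mime-type and set no keywords at all
*.png = svn:mime-type=image/png
*.gz = svn:mime-type=application/octet-stream
```

Listing patterns per extension means a newly added binary file never picks up svn:keywords by accident, at the cost of having to extend the list when a new text file type appears.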
From hartzell at alerce.com Thu Jun 28 14:06:50 2007 From: hartzell at alerce.com (George Hartzell) Date: Thu, 28 Jun 2007 14:06:50 -0400 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <4683C385.3050904@sendu.me.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> <4683C385.3050904@sendu.me.uk> Message-ID: <18051.63674.685297.426813@almost.alerce.com> Sendu Bala writes: > [...] > I tried again in the same location and it told me I had to 'svn > cleanup', which I did. But subsequently it kept complaining about files > already being there. You needed to do the cleanup because svn exited gracelessly and you had to help it get back on its feet. The cleanup doesn't remove the stuff that you did get checked out, so it's still there getting in the way of your new checkout. > [...] > svn co > svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data > > causes this repeatable problem: > > [...] > A data/phredfile.phd > svn: In directory 'data' > svn: Can't move source to dest > svn: Can't move 'data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' to > 'data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': No such file or directory > > That is with Mac OS X svn command-line client, version 1.4.4 > > I can get bioperl-live/tags/release-0-9-2/t/data to check out fine with > a linux svn command-line client, version 1.2.3. I'm not 100% sure what's going on here, but I'm inclined to say "get a real computer" (and yes, I'm typing this on a mac...). I have a mac pro that runs FreeBSD-STABLE part time and it's ggrrreeaatt (as tony the tiger used to say).... I think that we're having trouble with case sensitivity.
My only evidence is that I can see where there have been both HUMBETGLOA.FASTA and HUMBETGLOA.fasta in the tree at various times. I can't figure out anything else that's weird about that file. On the other hand, I can't see how this would cause the error you're seeing though. The experiment would be to grab a usb or firewire disk (or even a memory stick), partition/format it as case sensitive (or even *unix*) and try to do svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data into it. If it works, voila. If not, I'll keep making stuff up, err, thinking about it. g. From dmessina at wustl.edu Thu Jun 28 14:15:32 2007 From: dmessina at wustl.edu (David Messina) Date: Thu, 28 Jun 2007 13:15:32 -0500 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <6830CBB0-70A1-4F84-9C52-F61AEDC4B83B@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> <4683C385.3050904@sendu.me.uk> <6830CBB0-70A1-4F84-9C52-F61AEDC4B83B@uiuc.edu> Message-ID: <459D9BC0-4FBA-4560-80A8-E6243DE9D9CC@wustl.edu> Same svn error here on the full checkout. > What local (mac) svn version are you using? I'm running off macports: > > svn --version > svn, version 1.4.4 (r25188) > compiled Jun 16 2007, 23:40:53 I have svn 1.4.3. % svn --version svn, version 1.4.3 (r23084) compiled Apr 1 2007, 02:47:14 Copyright (C) 2000-2006 CollabNet. Subversion is open source software, see http://subversion.tigris.org/ This product includes software developed by CollabNet (http:// www.Collab.Net/). The following repository access (RA) modules are available: * ra_dav : Module for accessing a repository via WebDAV (DeltaV) protocol. 
- handles 'http' scheme * ra_svn : Module for accessing a repository using the svn network protocol. - handles 'svn' scheme * ra_local : Module for accessing a repository on local disk. - handles 'file' scheme From cjfields at uiuc.edu Thu Jun 28 14:54:15 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 28 Jun 2007 13:54:15 -0500 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <18051.63674.685297.426813@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> <4683C385.3050904@sendu.me.uk> <18051.63674.685297.426813@almost.alerce.com> Message-ID: On Jun 28, 2007, at 1:06 PM, George Hartzell wrote: > ... > I'm not 100% sure what's going on here, but I'm inclined to say "get a > real computer" (and yes, I'm typing this on a mac...). I have a mac > pro that runs FreeBSD-STABLE part time and it's ggrrreeaatt (as tony > the tiger used to say).... Ouch! Though it could be worse (**coughwindowscough**). > I think that we're having trouble with case sensitivity. My only > evidence is that I can see where there have been both HUMBETGLOA.FASTA > and HUMBETGLOA.fasta in the tree at various times. I can't figure out > anything else that's weird about that file. On the other hand, I > can't see how this would cause the error you're seeing though. Odd that other branches (including the main trunk) work but that one doesn't. > The experiment would be to grab a usb or firewire disk (or even a > memory stick), partition/format it as case sensitive (or even *unix*) > and try to do > > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- > live/tags/release-0-9-2/t/data > > into it. If it works, voila. 
If not, I'll keep making stuff up, err, > thinking about it. > > g. I'll have to figure out why I can't get ssh keys to work locally to test it out more (I have a usb drive to test with); just don't have time at the moment. chris From dmessina at wustl.edu Thu Jun 28 14:47:04 2007 From: dmessina at wustl.edu (David Messina) Date: Thu, 28 Jun 2007 13:47:04 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <18051.61992.627473.323346@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> <18051.61992.627473.323346@almost.alerce.com> Message-ID: <0027C4E0-26B1-41F3-8FD8-EAB5465CA80E@wustl.edu> > That's a great starting point. Do you have write access to the wiki? > Could you link it off of the instructions for using svn? Done. 
http://www.bioperl.org/wiki/Svn_auto-props linked from: http://www.bioperl.org/wiki/Using_Subversion (bottom of page) From bix at sendu.me.uk Thu Jun 28 15:19:35 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 20:19:35 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> <4683DBEA.90005@sendu.me.uk> <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu> Message-ID: <468409C7.7020102@sendu.me.uk> Chris Fields wrote: > On Jun 28, 2007, at 11:03 AM, Sendu Bala wrote: > Here's a question: how do we plan on handling uploading bioperl > updates to CPAN via PAUSE? Do we want to run every single module > through one pumpkin? Or do we want to have a core dev group PAUSE > account? I can see, for instance, removing everything > EUtilities-related and submitting it independently using my own PAUSE account, > but it would be nice to have it under an umbrella 'bioperl-devs' > account instead. All Bioperl modules (except the Bundle!) are owned by BIOPERLML on PAUSE. It's a little awkward since PAUSE is uploader-centric, but see my notes at http://www.bioperl.org/wiki/Making_a_BioPerl_release And certainly, everything that wants to consider itself part of Bioperl (and gain the benefit of lots of devs looking after it) should have BIOPERLML as the primary owner. > I think so, but the feasibility issue is critical. Do we want > cvs/svn to be divided up into 900 subdirectories (one for each module), > or do we want to have a similar directory structure as we have now, > but with each module in its own directory?
Or leave everything as > is and generate Build.PL on-the-fly (prob. least feasible)? Very definitely the latter. The key benefit of my approach is that the organisation stays as is and that a snapshot of the repository remains a single directory of modules in Bio, so that people don't have to 'install' Bioperl; they can still just uncompress the archive (or check out the package from svn) and point their PERL5LIB to the root dir of the package. For that reason I very much like the idea of folding the current split-out packages (run, network etc.) back into the core package so everything is in one place. Folding them back in should obviously wait until everything is in place and working with core already. My proposal obviously wasn't very clear. As far as all other devs are concerned, nothing changes at all (except for lots of new improved test scripts). The pumpkin will, however, be able to say: ./Build dist Right now that generates the distribution archives (in different compression formats) - one big archive containing everything. My proposal is simply that instead it generates lots of archives, one archive per module. It will also generate some Bundles and whatever else might be needed. I don't envisage any major difficulties in achieving this. The 'feasibility' issue I was going to look into was strictly regarding doing all the new test scripts.
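To make the "one archive per module" idea above concrete, here is a rough shell sketch. It is not how ./Build dist works today and not a proposed implementation: the function name make_module_archives, its two arguments and the archive naming scheme are all hypothetical, and it ignores the per-module tests, data files and generated Build.PL that a real distribution would need.

```shell
# Sketch only: write one .tar.gz per .pm file found under a source tree.
# make_module_archives, its arguments and the archive naming are hypothetical;
# the real ./Build dist action does not work this way.
make_module_archives() {
    src_root="${1%/}"   # tree holding the modules, e.g. a checkout root
    out_dir="$2"        # directory that receives the per-module archives
    mkdir -p "$out_dir" || return 1

    find "$src_root" -type f -name '*.pm' | sort | while IFS= read -r path; do
        rel="${path#"$src_root"/}"                      # Bio/SeqIO/genbank.pm
        name=$(printf '%s' "${rel%.pm}" | tr '/' '-')   # Bio-SeqIO-genbank
        # Store the module under its path relative to the tree root
        tar -czf "$out_dir/$name.tar.gz" -C "$src_root" "$rel" || exit 1
        printf '%s\n' "$name.tar.gz"
    done
}
```

Called as `make_module_archives /path/to/checkout /path/to/dist`, it prints one archive name per module found. The repository layout itself is untouched, which is the point of the proposal: only the packaging step changes.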
From hartzell at alerce.com Thu Jun 28 15:43:38 2007 From: hartzell at alerce.com (George Hartzell) Date: Thu, 28 Jun 2007 15:43:38 -0400 Subject: [Bioperl-l] First cut svn repository In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> <4683C385.3050904@sendu.me.uk> <18051.63674.685297.426813@almost.alerce.com> Message-ID: <18052.3946.224905.415905@almost.alerce.com> Chris Fields writes: > > On Jun 28, 2007, at 1:06 PM, George Hartzell wrote: > > > ... > > I'm not 100% sure what's going on here, but I'm inclined to say "get a > > real computer" (and yes, I'm typing this on a mac...). I have a mac > > pro that runs FreeBSD-STABLE part time and it's ggrrreeaatt (as tony > > the tiger used to say).... > > Ouch! Though it could be worse (**coughwindowscough**). > > > I think that we're having trouble with case sensitivity. My only > > evidence is that I can see where there have been both HUMBETGLOA.FASTA > > and HUMBETGLOA.fasta in the tree at various times. I can't figure out > > anything else that's weird about that file. On the other hand, I > > can't see how this would cause the error you're seeing though. > > Odd that other branches (including the main trunk) work but that one > doesn't. > > > The experiment would be to grab a usb or firewire disk (or even a > > memory stick), partition/format it as case sensitive (or even *unix*) > > and try to do > > > > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- > > live/tags/release-0-9-2/t/data > > > > into it. If it works, voila. If not, I'll keep making stuff up, err, > > thinking about it. > > > > g. 
> > I'll have to figure out why I can't get ssh keys to work locally to > test it out more (I have a usb drive to test with); just don't have > time at the moment. I just did the experiment, and filename-insensitivity seems to be breaking something. I'm using an svn I picked up from http://www.codingmonkeys.de/mbo/. I reformatted a memory stick to be case sensitive and co of bioperl/bioperl-live/tags/release-0-9-2/t worked, then I made a directory in my home dir (normal mac thing) and got the same error as above. I can get a copy of the trunk, so I'm inclined to ask someone to mention the problem on the wiki and then just ignore it. g. From cjfields at uiuc.edu Thu Jun 28 16:29:09 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 28 Jun 2007 15:29:09 -0500 Subject: [Bioperl-l] Splits again In-Reply-To: <468409C7.7020102@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> <4683DBEA.90005@sendu.me.uk> <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu> <468409C7.7020102@sendu.me.uk> Message-ID: <026156F4-4C46-4CC6-82B5-07FC5326A244@uiuc.edu> On Jun 28, 2007, at 2:19 PM, Sendu Bala wrote: > Chris Fields wrote: >> On Jun 28, 2007, at 11:03 AM, Sendu Bala wrote: >> Here's a question: how do we plan on handling uploading bioperl >> updates to CPAN via PAUSE? Do we want to run every single module >> through one pumpkin? Or do we want to have a core dev group PAUSE >> account? I can see, for instance, removing everything EUtilities- >> related and submitting it independently using my own PAUSE account, >> but it would be nice to have it under an umbrella 'bioperl-devs' >> account instead. > > All Bioperl modules (except the Bundle!) are owned by BIOPERLML on > PAUSE. 
Its a little akward since PAUSE is uploader-centric, but see my > notes at http://www.bioperl.org/wiki/Making_a_BioPerl_release > > And certainly, everything that wants to consider itself part of > Bioperl > (and gain the benefit of lots of devs looking after it) should > certainly > have BIOPERLML as the primary owner. Alrighty then. >> I think so, but the feasibility issue is critical. Do we want cvs/ >> svn to be divided up into 900 subdirectories (one for each module), >> or do we want to have a similar directory structure as we have now, >> but with each module in it's own directory? Or leave everything as >> is and generate Build.PL on-the-fly (prob. least feasible)? > > Very definitely the latter. The key benefit of my approach is that the > organisation stays as is and that a snapshot of the repository > remains a > single directory of modules in Bio so that people don't have to > 'install' Bioperl, they can still just uncompress the archive (or > check > out the package from svn) and point their PERL5LIB to the root dir of > the package. Okay, makes sense. > For that reason I very much like the idea of folding the current > split-out packages (run, network etc.) back into the core package so > everything is one place. Folding them back in should obviously wait > until everything is in place and working with core already. I agree, but that's up to Brian, Hilmar, and the others who donated the packages (or at least a consensus of core devs). One thing at a time. > My proposal obviously wasn't very clear. As far as all other devs are > concerned, nothing changes at all (except for lots of new improved > test > scripts). The pumpkin will, however, be able to say: > > ./Build dist > > Right now that generates the distribution archives (in different > compression formats) - one big archive containing everything. > My proposal is simply that instead it generates lots of archives, one > archive per module. 
It will also generate some Bundles and whatever > else > might be needed. We'll need to define which tests and data go with each module and so on. > I don't envisage any major difficulties in achieving this. The > 'feasibility' issue I was going to look into was strictly regarding > doing all the new test scripts. Okay. Maybe it's worth doing on a branch as a test run when 1.5.3 is ready to go. We'll still need to get thoughts on this from other core devs out there, and it prob. should wait until everybody is comfortable with the idea. chris From dmessina at wustl.edu Thu Jun 28 18:13:48 2007 From: dmessina at wustl.edu (David Messina) Date: Thu, 28 Jun 2007 17:13:48 -0500 Subject: [Bioperl-l] Splits again In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: Coming late to this party, I'm replying to snippets from multiple emails. > [Chris] > what we do about deprecated modules which linger > about on CPAN > [Sendu] > Delete them from CPAN seems appropriate. I coulda sworn this was frowned upon, but a recent thread suggests it's totally kosher. http://www.nntp.perl.org/group/perl.qa/2007/03/msg8473.html > [Sendu] > So, regardless of anything else can we all agree that per-module test > scripts are a good idea and should be worked on? I agree. > [Sendu] > people don't have to > 'install' Bioperl, they can still just uncompress the archive (or > check > out the package from svn) and point their PERL5LIB to the root dir of > the package. Could you elaborate a bit on how this works? How is XS code that needs compiling handled? Or the scripts directory? I would love to be able to do this.
> [Sendu] > For that reason I very much like the idea of folding the current > split-out packages (run, network etc.) back into the core package so > everything is one place. Folding them back in should obviously wait > until everything is in place and working with core already. From an organizational standpoint, I'm concerned that with ~900 modules in core right now, adding all of the additional stuff from the split-out packages would make for a daunting directory. But as you said, this is way down the road, so this proposal doesn't bear on the other, closer-to-now issues on the table. > [Chris] > Okay. Maybe it's worth doing on a branch as a test run when 1.5.3 > is ready to go. We'll still need to get thoughts on this from other > core devs out there, and it prob. should until everybody is > comfortable with the idea. If we go forward with the CPAN split plan, I like the idea of having a trial. We can foresee some of the issues that such a change may bring, and yet still more no doubt wait for us once we do it. Dave From bix at sendu.me.uk Thu Jun 28 18:59:35 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 23:59:35 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: <46843D57.2080409@sendu.me.uk> David Messina wrote: >> people don't have to 'install' Bioperl, they can still just >> uncompress the archive (or check out the package from svn) and >> point their PERL5LIB to the root dir of the package. > > Could you elaborate a bit on how this works? How is XS code that > needs compiling handled? Or the scripts directory? I would love to be > able to do this. I meant for the most part. 
Core doesn't have any XS code so that's not an issue. Scripts can be run manually like any other perl script. When you discover something isn't working because of a missing external dependency, you just install it. (But that happens very rarely.) Personally I've /never/ installed Bioperl and used that installed set of modules. I've always just pointed my PERL5LIB at the distribution folder or my cvs checkout. Which makes me a strange candidate for advocating all these CPAN-specific changes, but there you go ;) From cjfields at uiuc.edu Thu Jun 28 19:03:02 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 28 Jun 2007 18:03:02 -0500 Subject: [Bioperl-l] Splits again In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: <8B6FBB52-5CCE-4122-876C-B9827C86E46E@uiuc.edu> On Jun 28, 2007, at 5:13 PM, David Messina wrote: > Coming late to this party, I'm replying to snippets from multiple > emails. > > >> [Chris] >> what we do about deprecated modules which linger >> about on CPAN > >> [Sendu] >> Delete them from CPAN seems appropriate. > > I coulda sworn this was frowned upon, but a recent thread suggests > it's totally kosher. > > http://www.nntp.perl.org/group/perl.qa/2007/03/msg8473.html As long as it doesn't show up somewhere to confuse newbies I'm okay with it. >> [Sendu] >> people don't have to >> 'install' Bioperl, they can still just uncompress the archive (or >> check >> out the package from svn) and point their PERL5LIB to the root dir of >> the package. > > Could you elaborate a bit on how this works? How is XS code that > needs compiling handled? Or the scripts directory? I would love to > be able to do this. 
Maybe Sendu can add to this, but the XS code is limited to bioperl-ext AFAIK. We could keep that separate until it plays well with bioperl itself. Scripts and examples - maybe packaged along with a Bundle? >> [Sendu] >> For that reason I very much like the idea of folding the current >> split-out packages (run, network etc.) back into the core package so >> everything is one place. Folding them back in should obviously wait >> until everything is in place and working with core already. > > From an organizational standpoint, I'm concerned that with ~900 > modules in core right now, adding all of the additional stuff from > the split-out packages would make for a daunting directory. > > But as you said, this is way down the road, so this proposal > doesn't bear on the other, closer-to-now issues on the table. Well, the code in bioperl-db and network complements code in core, so I agree with Sendu they belong there. They should be under the same scrutiny as the rest anyway (code, tests, etc), but won't be bundled unless there is an 'install everything' Bundle. >> [Chris] >> Okay. Maybe it's worth doing on a branch as a test run when 1.5.3 >> is ready to go. We'll still need to get thoughts on this from other >> core devs out there, and it prob. should until everybody is >> comfortable with the idea. > > If we go forward with the CPAN split plan, I like the idea of > having a trial. We can foresee some of the issues that such a > change may bring, and yet still more no doubt wait for us once we > do it. That's what branches are for: testing stuff out like this. chris From hartzell at alerce.com Thu Jun 28 19:05:32 2007 From: hartzell at alerce.com (George Hartzell) Date: Thu, 28 Jun 2007 19:05:32 -0400 Subject: [Bioperl-l] problem with binary files. Message-ID: <18052.16060.932502.183552@almost.alerce.com> Ok, after pointing out the problem with setting the svn:keywords property on binary files, it turns out that I *did* that.
Worse yet, I set the svn:eol-style to 'native' on everything, including binary files, so depending on your platform they're likely to be fubar. For example, bioperl-run/t/data/H_pylori_J99.glimmer2.icm may or may not be what you expect it to be, depending on whether your eol-style matches the server's and whether any conversions were done. I'll touch up the way that the little tool I'm using calls cvs2svn and redo the repository. g. From n.haigh at sheffield.ac.uk Fri Jun 29 02:59:21 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 29 Jun 2007 07:59:21 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: <4684ADC9.8040404@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 - -- split -- >> [Sendu] >> For that reason I very much like the idea of folding the current >> split-out packages (run, network etc.) back into the core package so >> everything is one place. Folding them back in should obviously wait >> until everything is in place and working with core already. > > From an organizational standpoint, I'm concerned that with ~900 > modules in core right now, adding all of the additional stuff from > the split-out packages would make for a daunting directory. > > But as you said, this is way down the road, so this proposal doesn't > bear on the other, closer-to-now issues on the table. > I don't think this is an issue - it would simply mean everything is under the same version control hierarchy. And with svn it's Soooooo much easier to fiddle around with directory structures. > > >> [Chris] >> Okay. Maybe it's worth doing on a branch as a test run when 1.5.3 >> is ready to go.
We'll still need to get thoughts on this from other >> core devs out there, and it prob. should until everybody is >> comfortable with the idea. > > If we go forward with the CPAN split plan, I like the idea of having > a trial. We can foresee some of the issues that such a change may > bring, and yet still more no doubt wait for us once we do it. > Under svn it would be easy to make an "svn copy" of run, network etc into a branch of live to test this out. Not that this might be a problem, but: Since we are looking at bioperl-* packages being under the same svn repository, then then "svn copy's" are cheap for disk space. > > Dave > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGhK3JczuW2jkwy2gRAtI2AJ4kNrpGY8XMMh9KxOqs+l0PrEVcwgCfVFj6 BCvltmPyWF4ImueYmd7VFAc= =ktl+ -----END PGP SIGNATURE----- From n.haigh at sheffield.ac.uk Fri Jun 29 03:05:33 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 29 Jun 2007 08:05:33 +0100 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <18051.61992.627473.323346@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> <18051.61992.627473.323346@almost.alerce.com> Message-ID: <4684AF3D.5090907@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 George Hartzell wrote: - -- snip -- > > [...] > > I've googled around and gathered the following as a possible list for > > our repo. 
Since I obviously don't know what I'm doing :), of course > > adjust and refine as necessary. > > > > That's a great starting point. Do you have write access to the wiki? > Could you link it off of the instructions for using svn? > > g. Don't .t files need adding to the auto-props? Nath -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGhK89czuW2jkwy2gRAnRGAJ0VnBNVBAdQdfUnqPhmvsyQnD/bswCggSHC /Iivb6Lc4/51bUdrTmRQYlE= =V+t2 -----END PGP SIGNATURE----- From sac at bioperl.org Fri Jun 29 04:25:36 2007 From: sac at bioperl.org (Steve Chervitz) Date: Fri, 29 Jun 2007 01:25:36 -0700 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> Message-ID: <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> On 6/27/07, Chris Fields wrote: > > On Jun 26, 2007, at 3:21 PM, George Hartzell wrote: > > > ... > > If you have a dev.open-bio.org account and you're in the bioperl > > group, you're good to get at it via: > > > > file:///home/hartzell/bioperl > > > > or > > > > svn+ssh://dev.open-bio.org/home/hartzell/bioperl > > I managed to get it working using file://. Haven't tried svn+ssh yet > but I've had persistent problems getting ssh to work properly on my > macbook; not sure why yet but I haven't had time to play around with it. Are you using the ssh that comes installed with OSX? If so, I'd recommend installing openssh from MacPorts. I recall having issues with the stock version which were resolved by using the more up-to-date version you can get via MacPorts. 
BTW, I haven't been able to check out the new svn repository via svn+ssh:// because I can't get svn to authenticate with an alternative username. My username on dev.open-bio.org differs from what it is on my local machine, so I issue a command such as: steve at localhost $ svn --username sac checkout svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk but I get challenged with: steve at dev.open-bio.org's password: I also tried putting the --username argument after the subcommand, but it still wants to use my local username. I can ssh -l sac into the dev box no problem. Any suggestions? Steve From bix at sendu.me.uk Fri Jun 29 04:52:42 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 29 Jun 2007 09:52:42 +0100 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> Message-ID: <4684C85A.5030206@sendu.me.uk> Steve Chervitz wrote: > BTW, I haven't been able to check out the new svn repository via > svn+ssh:// because I can't get svn to authenticate with an alternative > username. My username on dev.open-bio.org differs from what it is on > my local machine, so I issue a command such as: > > steve at localhost $ svn --username sac checkout > svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk > > but I get challenged with: > steve at dev.open-bio.org's password: > > I also tried putting the --username argument after the subcommand, but > it still wants to use my local username. I can ssh -l sac into the dev > box no problem. Any suggestions? 
Set up your ssh key on the dev machine. I'm also on a machine with the wrong username and it works even without attempting to supply the correct one. It does, however, show the 'Welcome to the new developer system' message 2 or 3 times for every svn+ssh action, which freaks me out a little. From N.Haigh at sheffield.ac.uk Fri Jun 29 05:32:38 2007 From: N.Haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 29 Jun 2007 10:32:38 +0100 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> Message-ID: <1183109558.4684d1b69bcec@webmail.shef.ac.uk> Quoting Steve Chervitz : -- snip -- > BTW, I haven't been able to check out the new svn repository via > svn+ssh:// because I can't get svn to authenticate with an alternative > username. My username on dev.open-bio.org differs from what it is on > my local machine, so I issue a command such as: > > steve at localhost $ svn --username sac checkout > svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk > > but I get challenged with: > steve at dev.open-bio.org's password: > > I also tried putting the --username argument after the subcommand, but > it still wants to use my local username. I can ssh -l sac into the dev > box no problem. Any suggestions? 
> > Steve > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > You could try: svn+ssh://USERNAME at dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk Nath From dmessina at wustl.edu Fri Jun 29 08:28:26 2007 From: dmessina at wustl.edu (David Messina) Date: Fri, 29 Jun 2007 07:28:26 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> Message-ID: <42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu> > > BTW, I haven't been able to check out the new svn repository via > svn+ssh:// because I can't get svn to authenticate with an alternative > username. I have the same issue. I set up a stanza in my ~/.ssh/config: Host dev.open-bio.org User dave_messina where dave_messina is my dev.open-bio.org username. 
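A minimal sketch of the two workarounds suggested in this thread: put the remote username into the svn+ssh URL, or set it per host in ~/.ssh/config. The function names are made up for illustration, and the sketch assumes the archive's " at " stands for "@" in the real URL.

```shell
#!/bin/sh
# Hypothetical helpers; the host and repository path are the ones quoted
# in this thread, everything else is illustrative.

# Print the checkout URL with the remote username embedded.
svn_ssh_url() {
    remote_user=$1
    printf 'svn+ssh://%s@dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk\n' "$remote_user"
}

# Print an ssh_config stanza that makes ssh (and therefore svn+ssh)
# log in to dev.open-bio.org under the given remote username.
ssh_config_stanza() {
    remote_user=$1
    printf 'Host dev.open-bio.org\n    User %s\n' "$remote_user"
}

# Usage (not run here; replace the usernames with your own):
#   svn checkout "$(svn_ssh_url sac)"
#   ssh_config_stanza dave_messina >> ~/.ssh/config
```

With the stanza in place the plain svn+ssh://dev.open-bio.org/... URL should pick up the right username, since svn hands the connection to ssh.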
From cjfields at uiuc.edu Fri Jun 29 13:00:27 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 29 Jun 2007 12:00:27 -0500 Subject: [Bioperl-l] ssh keys [was Re: First cut svn repository] In-Reply-To: <42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> <42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu> Message-ID: On Jun 29, 2007, at 7:28 AM, David Messina wrote: >> >> BTW, I haven't been able to check out the new svn repository via >> svn+ssh:// because I can't get svn to authenticate with an >> alternative >> username. > > I have the same issue. I set up a stanza in my ~/.ssh/config: > > Host dev.open-bio.org > User dave_messina > > where dave_messina is my dev.open-bio.org username. I changed to the macports ssh w/o luck. It appears the key is offered up, so maybe the problem is how I have everything set up on dev (though I followed everything on the wiki): .... Contact 'support at open-bio.org' for your new login information. ====================================== debug1: Authentications that can continue: publickey,gssapi-with-mic,password debug1: Next authentication method: publickey debug1: Offering public key: /Users/cjfields/.ssh/id_dsa debug2: we sent a publickey packet, wait for reply debug1: Authentications that can continue: publickey,gssapi-with-mic,password debug2: we did not send a packet, disable method debug1: Next authentication method: password It's odd; I can use passwordless logins for other servers (admittedly Mac servers) w/o problems using ssh keys, but dev.open-bio.org always prompts for a password regardless.
My feeling is it's something with my local ssh or sshd config; I'll try fiddling with it to see what happens. Anyone have suggestions? I've lost enough hair as is; don't want to lose more! chris From sac at bioperl.org Fri Jun 29 13:07:45 2007 From: sac at bioperl.org (Steve Chervitz) Date: Fri, 29 Jun 2007 10:07:45 -0700 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <1183109558.4684d1b69bcec@webmail.shef.ac.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> <1183109558.4684d1b69bcec@webmail.shef.ac.uk> Message-ID: <8f200b4c0706291007x2b765323n75c9003a47fe7cbb@mail.gmail.com> On 6/29/07, Nathan S. Haigh wrote: > Quoting Steve Chervitz : > > -- snip -- > > > BTW, I haven't been able to check out the new svn repository via > > svn+ssh:// because I can't get svn to authenticate with an alternative > > username. My username on dev.open-bio.org differs from what it is on > > my local machine, so I issue a command such as: > > > > steve at localhost $ svn --username sac checkout > > svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk > > > > but I get challenged with: > > steve at dev.open-bio.org's password: > > > > I also tried putting the --username argument after the subcommand, but > > it still wants to use my local username. I can ssh -l sac into the dev > > box no problem. Any suggestions? > > [...] > You could try: > svn+ssh://USERNAME at dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk Bingo. Thanks for the tips, guys. BTW, setting up ssh keys was not the issue, since my key is already set up on the dev machine. 
The svn --username setting appears to not be operative at the ssh layer. I suspected this might be the case given that the usage info says: $ svn --help co --username arg : specify a username ARG --password arg : specify a password ARG which seemed insecure. I didn't want to send my password in the clear, and didn't know if or whether svn would hand it off to ssh. It wasn't even sending my username to ssh, so I knew something was wrong. These args are probably only intended for accessing local svn repositories, or non-svn+ssh-based checkouts. BTW, the svn+ssh check out on Mac OS X works for me. I'm using svn and openssh installed via MacPorts: $ svn --version svn, version 1.4.4 (r25188) compiled Jun 28 2007, 23:51:53 $ ssh -version OpenSSH_4.6p1, OpenSSL 0.9.8e 23 Feb 2007 Steve From hartzell at alerce.com Fri Jun 29 15:19:31 2007 From: hartzell at alerce.com (George Hartzell) Date: Fri, 29 Jun 2007 15:19:31 -0400 Subject: [Bioperl-l] ssh keys [was Re: First cut svn repository] In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> <42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu> Message-ID: <18053.23363.102371.602742@almost.alerce.com> Chris Fields writes: > > On Jun 29, 2007, at 7:28 AM, David Messina wrote: > > >> > >> BTW, I haven't been able to check out the new svn repository via > >> svn+ssh:// because I can't get svn to authenticate with an > >> alternative > >> username. > > > > I have the same issue. I set up a stanza in my ~/.ssh/config: > > > > Host dev.open-bio.org > > User dave_messina > > > > where dave_messina is my dev.open-bio.org username. > > I changed to the macports ssh w/o luck. 
It appears the key is > offered up, so maybe the problem is how I have everything set up on > dev (though I followed everything on the wiki): A couple of things to check. - make sure that you put your public key in ~/.ssh/authorized_keys2 (not authorized_keys) - make sure that authorized_keys2 is chmod'ed 600 (644 might be enough...). - make sure that ~/.ssh is chmod'ed 700. - make sure that your home directory is 755. Then see if it works. You might be able to relax some of those protections a bit, but ssh is uptight about letting other people mess with that data. g. From dmessina at wustl.edu Fri Jun 29 18:47:14 2007 From: dmessina at wustl.edu (David Messina) Date: Fri, 29 Jun 2007 17:47:14 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <4684AF3D.5090907@sheffield.ac.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> <18051.61992.627473.323346@almost.alerce.com> <4684AF3D.5090907@sheffield.ac.uk> Message-ID: <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu> > [Nathan] > Don't .t files need adding to the auto-props? Yes -- thanks for reminding me. Please feel free to add it to the wiki page. I'll be tweaking it some more later on in any case. Dave From n.haigh at sheffield.ac.uk Sat Jun 30 05:55:56 2007 From: n.haigh at sheffield.ac.uk (Nathan S.
Haigh) Date: Sat, 30 Jun 2007 10:55:56 +0100 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> <18051.61992.627473.323346@almost.alerce.com> <4684AF3D.5090907@sheffield.ac.uk> <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu> Message-ID: <468628AC.9060200@sheffield.ac.uk> David Messina wrote: >> [Nathan] >> Don't .t files need adding to the auto-props? > > Yes -- thanks for reminding me. Please feel free to add it to the wiki > page. I'll be tweaking it some more later on in any case. > > > Dave I noticed this has already been done. I have just been through the t/data dir and added a list of extensions I found (without props). There are some files without extensions, how should these be dealt with? There seems to be a plethora of file naming styles which means there's a pretty long list of non-standard extensions. So at some point someone will commit a new data file with a new extension (often describing what program created the output or the test for which it's intended) that won't be in the auto-props file - can you think of a way around this? 
Nath From cjfields at uiuc.edu Sat Jun 30 08:48:10 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 30 Jun 2007 07:48:10 -0500 Subject: [Bioperl-l] ssh keys [was Re: First cut svn repository] In-Reply-To: <18053.23363.102371.602742@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> <42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu> <18053.23363.102371.602742@almost.alerce.com> Message-ID: <3874B4EE-0119-40BC-8B92-11133A766417@uiuc.edu> On Jun 29, 2007, at 2:19 PM, George Hartzell wrote: > Chris Fields writes: >> >> On Jun 29, 2007, at 7:28 AM, David Messina wrote: >> >>>> >>>> BTW, I haven't been able to check out the new svn repository via >>>> svn+ssh:// because I can't get svn to authenticate with an >>>> alternative >>>> username. >>> >>> I have the same issue. I set up a stanza in my ~/.ssh/config: >>> >>> Host dev.open-bio.org >>> User dave_messina >>> >>> where dave_messina is my dev.open-bio.org username. >> >> I changed to the macports ssh w/o luck. It appears the key is >> offered up, so maybe the problem is how I have everything set up on >> dev (though I followed everything on the wiki): > > A couple of things to check. > > - make sure that you put your public key in ~/.ssh/authorized_keys2 > (not authorized_keys) > > - make sure that authorized_keys2 is chmod'ed 600 (644 might be > enough...). > > - make sure that ~/.ssh is chmoded 700. > > - make sure that your home directory is 755. > > Then see if it works. You might be able to relax some of those > protections a bit, but ssh's uptight about letting other people mess > with that data. > > g. 
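The checklist quoted above could be applied with something like the following, run on the remote host. This is only a sketch: fix_ssh_perms is a hypothetical helper name, and the modes are the ones George lists, taking the stricter 600 for the key file.

```shell
#!/bin/sh
# fix_ssh_perms [HOME_DIR]
# Apply the permissions from the checklist to a home directory
# (defaults to $HOME). Stops at the first chmod that fails.
fix_ssh_perms() {
    home_dir=${1:-$HOME}
    chmod 755 "$home_dir" &&                        # home dir: not writable by group/others
    chmod 700 "$home_dir/.ssh" &&                   # ~/.ssh: owner only
    chmod 600 "$home_dir/.ssh/authorized_keys2"     # key file: owner read/write only
}

# Usage on the remote host (not run automatically):
#   fix_ssh_perms "$HOME"
#   ls -ld "$HOME" "$HOME/.ssh" "$HOME/.ssh/authorized_keys2"
```

OpenSSH's sshd, with its StrictModes option enabled, ignores a user's key file when the home directory or ~/.ssh is writable by other users, so a key that is visibly "offered" in the debug output can still be refused for this reason.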
Got it working; it was the permissions on my home dir (the last one). Thanks George! chris From dmessina at wustl.edu Sat Jun 30 11:37:44 2007 From: dmessina at wustl.edu (David Messina) Date: Sat, 30 Jun 2007 10:37:44 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <468628AC.9060200@sheffield.ac.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> <18051.61992.627473.323346@almost.alerce.com> <4684AF3D.5090907@sheffield.ac.uk> <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu> <468628AC.9060200@sheffield.ac.uk> Message-ID: <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu> > I have just been through the t/data dir and added a list of > extensions I found Thanks! That's a big help. I'll add prop definitions to those shortly. > There are some files without extensions, how should these be dealt > with? If you look in the text files section, there are some files there which don't have extensions, e.g. AUTHORS, BUGS. There's also Makefile.* so we have some flexibility in how svn knows to auto-prop a file. I haven't read up on the details yet to find out how it handles files that match multiple criteria -- it may be dependent simply on the order they're defined. > There seems to be a plethora of file naming styles which means > there's a pretty long list of non-standard extensions. So at some > point someone will commit a new data file with a new extension > (often describing what program created the output or the test for > which it's intended) that won't be in the auto-props file - can you > think of a way around this? I've been thinking about this a bit. How about this?
- We have just "standard" files and extensions (like *.blast, *.fasta) in the auto-props list. - We manually add props for the files that have nonstandard, arbitrary extensions so all the files have now are prop'd. - At some point we rename those nonstandard files to have standard extensions. Especially for the t/data/ files, we'll have to make sure to update the tests that rely on them. - We can have the suggested list of extensions for new files that get added. I don't think we need to strictly enforce this just for the sake of svn (after all, its primary function of version control will work just fine without any properties set), but it would be nice if we could try to keep to it mostly. Many distros come with an /etc/mime.types file which has the list of officially registered MIME types. I found a script that will take this list and convert it into auto-props format. I don't think we need to support *all* of the gazillion filetypes since most of them our repository will never see, but we certainly could.
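For illustration, a short auto-props list along these lines might look like the fragment printed below. This is a sketch, not the list on the wiki: the patterns come from names mentioned in this thread, while the property values (and the print_auto_props name) are assumptions.

```shell
#!/bin/sh
# Print an illustrative fragment for the Subversion runtime config
# (typically ~/.subversion/config). Property values are assumptions
# chosen for the example, not the project's agreed settings.
print_auto_props() {
    cat <<'EOF'
[miscellany]
enable-auto-props = yes

[auto-props]
*.pm = svn:eol-style=native;svn:keywords=Id
*.pl = svn:eol-style=native;svn:keywords=Id
*.t = svn:eol-style=native
*.fasta = svn:eol-style=native
*.blast = svn:eol-style=native
AUTHORS = svn:eol-style=native
Makefile.* = svn:eol-style=native
*.icm = svn:mime-type=application/octet-stream
EOF
}

# Usage: review the output, then merge it by hand into the config file.
#   print_auto_props
```

Auto-props only take effect on svn add and svn import, so files already in the repository still need their properties set by hand. As far as I know, a file matching several patterns receives the properties from all of them and the order of application is not guaranteed, so one rule cannot be relied on to override another.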
Dave From dmessina at wustl.edu Sat Jun 30 12:26:27 2007 From: dmessina at wustl.edu (David Messina) Date: Sat, 30 Jun 2007 11:26:27 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> <18051.61992.627473.323346@almost.alerce.com> <4684AF3D.5090907@sheffield.ac.uk> <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu> <468628AC.9060200@sheffield.ac.uk> <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu> Message-ID: On Jun 30, 2007, at 10:37 AM, David Messina wrote: > - We manually add props for the files that have nonstandard, > arbitrary extensions so all the files have now are prop'd. Er, that should be - We manually add props for the files that have nonstandard, arbitrary extensions so that all the files now in the repository are prop'd. From n.haigh at sheffield.ac.uk Sat Jun 30 13:25:58 2007 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Sat, 30 Jun 2007 18:25:58 +0100 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> <18051.61992.627473.323346@almost.alerce.com> <4684AF3D.5090907@sheffield.ac.uk> <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu> <468628AC.9060200@sheffield.ac.uk> <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu> Message-ID: <46869226.70203@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 - -- snip -- > > >> There seems to be a plethora of file naming styles which means there's >> a pretty long list of non-standard extensions. So at some point >> someone will commit a new data file with a new extension (often >> describing what program created the output or the test for which it's >> intended) that won't be in the auto-props file - can you think of a >> way around this? > > Ive been thinking about this a bit. How about this? > > - We have just "standard" files and extensions (like *.blast, *.fasta) > in the auto-props list. I think the list of seq formats recognised by Bioperl in Bio::SeqIO and Bio::AlignIO would be a good start. As these are likely to be the ones that are sensitive to file format recognition and thus could break tests if renamed. I think a lot of people have used "." in file names as an alternative to a space. I think it would be beneficial to use an underscore "_" in these cases and leave the "." to represent the beginning of the file extension. 
> > - We manually add props for the files that have nonstandard, arbitrary > extensions so all the files that we currently have now are prop'd. > > - At some point we rename those nonstandard files to have standard > extensions. Especially for the t/data/ files, we'll have to make sure to > update the tests that rely on them. Nice and easy with svn :) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGhpHiczuW2jkwy2gRAuZ5AKCnd2MvCsvSn1NemDVMmabnieR2vACg1Qk0 pYVvXwxq0lpiGfM09RQ6A1I= =3Lhw -----END PGP SIGNATURE----- From cjfields at uiuc.edu Sat Jun 30 15:11:52 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 30 Jun 2007 14:11:52 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> <18051.61992.627473.323346@almost.alerce.com> <4684AF3D.5090907@sheffield.ac.uk> <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu> <468628AC.9060200@sheffield.ac.uk> <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu> Message-ID: On Jun 30, 2007, at 11:26 AM, David Messina wrote: > > On Jun 30, 2007, at 10:37 AM, David Messina wrote: > >> - We manually add props for the files that have nonstandard, >> arbitrary extensions so all the files have now are prop'd. > > Er, that should be > > - We manually add props for the files that have nonstandard, > arbitrary extensions so that all the files now in the repository are > prop'd. Do we need to define every filetype extension, or can there be a fallback (eg if it isn't on the list or has no extension it's plain text)? 
chris From hlapp at gmx.net Sat Jun 30 17:26:22 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Sat, 30 Jun 2007 17:26:22 -0400 Subject: [Bioperl-l] Splits again In-Reply-To: <468409C7.7020102@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> <4683DBEA.90005@sendu.me.uk> <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu> <468409C7.7020102@sendu.me.uk> Message-ID: On Jun 28, 2007, at 3:19 PM, Sendu Bala wrote: > [...] > Very definitely the latter. The key benefit of my approach is that > the organisation stays as is and that a snapshot of the repository > remains a single directory of modules in Bio so that people don't > have to 'install' Bioperl, they can still just uncompress the > archive (or check out the package from svn) and point their > PERL5LIB to the root dir of the package. I think this is absolutely key to keep in mind. Anything without this feature will likely be a non-starter. I don't really have time to follow the discussion let alone participate, so really all I can contribute is to offer some sanity/ reality checks (such as the above). In this sense, I understand a release pumpkin will generate ~900 packages to upload to CPAN? How much hassle is that compared to what uploading a bioperl release means right now? How brittle is all the Build.PL code that will be needed to automate all of this, and how difficult will it be to maintain? For example, if someone adds in 10 new modules, what Build.PL-related work will need to be done? 
-hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From bix at sendu.me.uk Sat Jun 30 17:32:52 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Sat, 30 Jun 2007 22:32:52 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> <4683DBEA.90005@sendu.me.uk> <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu> <468409C7.7020102@sendu.me.uk> Message-ID: <4686CC04.6000403@sendu.me.uk> Hilmar Lapp wrote: > On Jun 28, 2007, at 3:19 PM, Sendu Bala wrote: > >> [...] >> Very definitely the latter. The key benefit of my approach is that >> the organisation stays as is and that a snapshot of the repository >> remains a single directory of modules in Bio so that people don't >> have to 'install' Bioperl, they can still just uncompress the >> archive (or check out the package from svn) and point their >> PERL5LIB to the root dir of the package. [snip] > In this sense, I understand a release pumpkin will generate ~900 > packages to upload to CPAN? How much hassle is that compared to what > uploading a bioperl release means right now? I'd have to investigate. I did my uploads using the PAUSE website, which for 900 packages would be unfeasible. Will have to see if the process can be automated. > How brittle is all the Build.PL code that will be needed to automate > all of this, and how difficult will it be to maintain? For example, > if someone adds in 10 new modules, what Build.PL-related work will > need to be done? Well, my plan will be that once the work is done, you won't need to touch the Build.PL code again. 
My intent is that the pumpkin can just type one command and not think about anything. As for the reality, I won't know until I think about it properly and experiment. From hlapp at gmx.net Sat Jun 30 19:36:45 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Sat, 30 Jun 2007 19:36:45 -0400 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <18052.3946.224905.415905@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> <4683C385.3050904@sendu.me.uk> <18051.63674.685297.426813@almost.alerce.com> <18052.3946.224905.415905@almost.alerce.com> Message-ID: <2159ED58-E6F4-4ED8-AC23-E8BAF69FE240@gmx.net> On Jun 28, 2007, at 3:43 PM, George Hartzell wrote: > I just did the experiment, and filename-insensitivity seems to be > breaking something. > > I'm using an svn I picked up from http://www.codingmonkeys.de/mbo/. > > I reformatted a memory stick to be case sensitive and co of > > bioperl/bioperl-live/tags/release-0-9-2/t > > worked, then I made a directory in my home dir (normal mac thing) and > got the same error as above. You picked up a rename of a file from lower case extension to upper case extension. Unfortunately, there are several months between adding the upper-case and removing the lower-case version. 
We can reconstruct what happened with this using svn log on the directory (this does not require a checkout):

$ svn log --verbose svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk/t/data

Searching for HUMBETGLOA yields the following two commits that added one and removed the other:

------------------------------------------------------------------------
r2245 | jason | 2001-12-08 11:59:05 -0500 (Sat, 08 Dec 2001) | 2 lines
Changed paths:
   M /bioperl-live/trunk/t/SearchIO.t
   A /bioperl-live/trunk/t/data/HUMBETGLOA.FASTA
   A /bioperl-live/trunk/t/data/cysprot1.FASTA

added tests for FASTA

------------------------------------------------------------------------
r2877 | jason | 2002-03-11 22:39:40 -0500 (Mon, 11 Mar 2002) | 2 lines
Changed paths:
   A /bioperl-live/trunk/t/data/HUMBETGLOA.fa
   D /bioperl-live/trunk/t/data/HUMBETGLOA.fasta

renaming file to avoid clobbering on windows

Unfortunately, both files are in the tag (again, no checkout required):

$ svn list svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data | grep HUMBETGLOA | grep -i fasta
HUMBETGLOA.FASTA
HUMBETGLOA.fasta

We can remove the offending version from the repository (again, without needing a checkout):

$ svn rm svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data/HUMBETGLOA.fasta

I did this, and now the tag checks out fine on OSX. Can anyone confirm?
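The same kind of clash could probably be looked for across a whole tag before anyone tries a checkout on a case-insensitive filesystem. What follows is only a minimal sketch: the function name case_collisions is made up for illustration, and feeding it from svn list -R is an assumption about how one would use it, not something that was run here.

```shell
# Hypothetical helper: print every path that collides with another path
# once case is ignored. Reads one path per line on stdin and prints the
# colliding paths in byte order.
case_collisions() {
    awk '
        {
            key = tolower($0)
            count[key]++
            if (key in paths)
                paths[key] = paths[key] "\n" $0
            else
                paths[key] = $0
        }
        END {
            for (key in count)
                if (count[key] > 1)
                    print paths[key]
        }
    ' | LC_ALL=C sort
}

# Possible usage (assumes svn is installed and the URL is reachable):
#   svn list -R "$TAG_URL" | case_collisions
```

An empty result would mean no two paths in the listing differ only by case.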
(BTW the ability to operate on the repository w/o needing a checkout is another advantage of svn)

	-hilmar
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From cjfields at uiuc.edu Sat Jun 30 20:40:53 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 30 Jun 2007 19:40:53 -0500
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <2159ED58-E6F4-4ED8-AC23-E8BAF69FE240@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
	<18051.63674.685297.426813@almost.alerce.com>
	<18052.3946.224905.415905@almost.alerce.com>
	<2159ED58-E6F4-4ED8-AC23-E8BAF69FE240@gmx.net>
Message-ID:

Checkout worked for me (Mac OS X) using both:

svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data
svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/

so removing the offending file worked (good catch!). Haven't run a full co but probably isn't necessary.

chris

On Jun 30, 2007, at 6:36 PM, Hilmar Lapp wrote:

> On Jun 28, 2007, at 3:43 PM, George Hartzell wrote:
>
>> I just did the experiment, and filename-insensitivity seems to be
>> breaking something.
>>
>> I'm using an svn I picked up from http://www.codingmonkeys.de/mbo/.
>>
>> I reformatted a memory stick to be case sensitive and co of
>>
>> bioperl/bioperl-live/tags/release-0-9-2/t
>>
>> worked, then I made a directory in my home dir (normal mac thing) and
>> got the same error as above.
>
> You picked up a rename of a file from lower case extension to upper
> case extension.
Unfortunately, there are several months between > adding the upper-case and removing the lower-case version. > > We can reconstruct what happened with this using svn log on the > directory (this does not require a checkout): > > $ svn log --verbose svn+ssh://dev.open-bio.org/home/hartzell/ > bioperl/bioperl-live/trunk/t/data > > Searching for HUMBETGLOA yields the following two commits that > added one and removed the other: > > ---------------------------------------------------------------------- > -- > r2245 | jason | 2001-12-08 11:59:05 -0500 (Sat, 08 Dec 2001) | 2 lines > Changed paths: > M /bioperl-live/trunk/t/SearchIO.t > A /bioperl-live/trunk/t/data/HUMBETGLOA.FASTA > A /bioperl-live/trunk/t/data/cysprot1.FASTA > > added tests for FASTA > > ---------------------------------------------------------------------- > -- > r2877 | jason | 2002-03-11 22:39:40 -0500 (Mon, 11 Mar 2002) | 2 lines > Changed paths: > A /bioperl-live/trunk/t/data/HUMBETGLOA.fa > D /bioperl-live/trunk/t/data/HUMBETGLOA.fasta > > renaming file to avoid clobbering on windows > > Unfortunately, both files are in the tag (again, no checkout > required): > > $ svn list svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- > live/tags/release-0-9-2/t/data | grep HUMBETGLOA | grep -i fasta > HUMBETGLOA.FASTA > HUMBETGLOA.fasta > > We can remove the offending version from the repository (again, > without needing a checkout): > > $ svn rm svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- > live/tags/release-0-9-2/t/data/HUMBETGLOA.fasta > > I did this, and now the tag checks out fine on OSX. Can anyone > confirm? > > (BTW the ability to operate on the repository w/o needing a > checkout is another advantage of svn) > > -hilmar > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From hartzell at alerce.com Sat Jun 30 20:48:06 2007
From: hartzell at alerce.com (George Hartzell)
Date: Sat, 30 Jun 2007 17:48:06 -0700
Subject: [Bioperl-l] Take 2 of the new subversion repository.
Message-ID: <18054.63942.316904.413911@almost.alerce.com>

There's a second cut at the subversion repository. I've done a better job of setting svn:keywords and svn:eol-style on various files. The defaults were more cautious and I used an auto-props file based on the wiki version.

svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take2

The old repository's still around as

svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take1

I renamed it so that people wouldn't work with it by mistake. If, for some hard-to-imagine reason, you have a working copy that you want to run against it, you should be able to do an svn switch --relocate on your working copy and be back in shape. In fact, it might be a good time to give it a try....

g.
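One caveat on svn switch --relocate: it is meant for a repository whose URL has changed, and svn checks the repository UUID before it rewrites the working copy, so the two UUIDs are worth comparing first. A minimal sketch, assuming English-language svn info output (which prints a "Repository UUID:" line); the helper names repo_uuid and same_repository are hypothetical, not anything provided by svn.

```shell
# Hypothetical helper: extract the "Repository UUID" field from
# `svn info` output supplied on stdin.
repo_uuid() {
    sed -n 's/^Repository UUID: //p'
}

# Hypothetical helper: succeed only when two `svn info` outputs (passed
# as strings) name the same, non-empty repository UUID.
same_repository() {
    a=$(printf '%s\n' "$1" | repo_uuid)
    b=$(printf '%s\n' "$2" | repo_uuid)
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Possible usage (assumes svn is installed and both targets are reachable):
#   same_repository "$(svn info .)" "$(svn info "$NEW_URL")" \
#       && svn switch --relocate "$OLD_URL" "$NEW_URL"
```

If the UUIDs differ, a fresh checkout is the likely fallback.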
From hartzell at alerce.com Sat Jun 30 21:17:18 2007 From: hartzell at alerce.com (George Hartzell) Date: Sat, 30 Jun 2007 18:17:18 -0700 Subject: [Bioperl-l] First cut svn repository In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> <4683C385.3050904@sendu.me.uk> <18051.63674.685297.426813@almost.alerce.com> <18052.3946.224905.415905@almost.alerce.com> <2159ED58-E6F4-4ED8-AC23-E8BAF69FE240@gmx.net> Message-ID: <18055.158.30409.808612@almost.alerce.com> Chris Fields writes: > Checkout worked for me (Mac OS X) using both: > > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/ > tags/release-0-9-2/t/data > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/ > tags/release-0-9-2/ > > so removing the offending file worked (good catch!). Haven't run a > full co but probably isn't necessary. > [...] I'll keep a note of that as something to do when I prepare the final cut of the repository. g. From jason at bioperl.org Sat Jun 30 21:25:30 2007 From: jason at bioperl.org (Jason Stajich) Date: Sat, 30 Jun 2007 18:25:30 -0700 Subject: [Bioperl-l] Take 2 of the new subversion repository. In-Reply-To: <18054.63942.316904.413911@almost.alerce.com> References: <18054.63942.316904.413911@almost.alerce.com> Message-ID: Thanks George - I also did chgrp -R bioperl /home/hartzell/bioperl_take? to make sure the group permission was set right. We may also want to do a chmod g+s on all the dirs in there as well so that permissions are preserved when this gets deployed for real. 
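For the chmod g+s step, something along these lines might do. It is only a sketch: the function name setgid_dirs is made up, and the repository path in the usage comment is a placeholder. With the setgid bit on a directory, files and subdirectories created inside it inherit the directory's group rather than the creating user's primary group.

```shell
# Hypothetical helper: set the setgid bit on every directory at or below
# the given root, leaving regular files untouched.
setgid_dirs() {
    find "$1" -type d -exec chmod g+s {} +
}

# Possible usage (the path is a placeholder):
#   chgrp -R bioperl /path/to/repository
#   setgid_dirs /path/to/repository
```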
If anyone wants to, please make some changes to files and commit them, and make some branches/tags to play around a little bit, since we'll likely throw this away and do it again from a locked-down version from CVS at some appointed time.

Do you know how to have svn commit messages generate summary emails as well?

-j

On Jun 30, 2007, at 5:48 PM, George Hartzell wrote:

> There's a second cut at the subversion repository. I've done a better
> job of setting svn:keywords and svn:eol-style on various files. The
> defaults were more cautious and I used an auto-props files based on
> the wiki version.
>
> svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take2
>
> The old repository's still around as
>
> svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take1
>
> I renamed it so that people would work with it by mistake. If, for
> some hard-to-imagine reason, you have a working copy that you want to
> run against it, you should be able to do an svn switch --relocate on
> your working copy and be back in shape. In fact, it might be a good
> time to give it a try....
>
> g.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/

From hlapp at gmx.net Sat Jun 30 22:21:25 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 30 Jun 2007 22:21:25 -0400
Subject: [Bioperl-l] Take 2 of the new subversion repository.
In-Reply-To: <18054.63942.316904.413911@almost.alerce.com>
References: <18054.63942.316904.413911@almost.alerce.com>
Message-ID: <5F53A433-BAA9-431D-A0C5-5955690D0B73@gmx.net>

On Jun 30, 2007, at 8:48 PM, George Hartzell wrote:

> I renamed it so that people would work with it by mistake. If, for
> some hard-to-imagine reason, you have a working copy that you want to
> run against it,

It's not so hard to imagine - checking out the entire repository takes a long time.
> you should be able to do an svn switch --relocate on > your working copy and be back in shape. In fact, it might be a good > time to give it a try.... It doesn't work: svn: The repository at 'svn+ssh://dev.open-bio.org/home/hartzell/ bioperl_take2' has uuid '31277767-6726-dc11-ab4c-0019e3f901d6', but the WC has '27e854f1-f323-dc11-8c1b-0019e3f901d6' You can't relocate to a totally new repository (relocating to bioperl_take1 does work though). -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Sat Jun 30 22:39:27 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 30 Jun 2007 21:39:27 -0500 Subject: [Bioperl-l] Take 2 of the new subversion repository. In-Reply-To: References: <18054.63942.316904.413911@almost.alerce.com> Message-ID: <7C6FD6C9-CBED-40D3-BA90-4B34F79E6DE0@uiuc.edu> There are a few CPAN modules available; here's one: http://search.cpan.org/~dwheeler/SVN-Notify-2.66/lib/SVN/Notify.pm chris On Jun 30, 2007, at 8:25 PM, Jason Stajich wrote: > Thanks George - > I also did > chgrp -R bioperl /home/hartzell/bioperl_take? > to make sure the group permission was set right. > > We may also want to do a chmod g+s on all the dirs in there as well > so that permissions are preserved when this gets deployed for real. > > If anyone wants to make some changes to files and commit them, as > well as make some branches/tags to play around a little bit since > we'll likely throw this away and do it again from locked down version > from CVS at some appointed time. > > Do you know how to have svn commit messages generate summary emails > as well? > > -j > On Jun 30, 2007, at 5:48 PM, George Hartzell wrote: > >> >> There's a second cut at the subversion repository. I've done a >> better >> job of setting svn:keywords and svn:eol-style on various files. 
The >> defaults were more cautious and I used an auto-props files based on >> the wiki version. >> >> svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take2 >> >> The old repository's still around as >> >> svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take1 >> >> I renamed it so that people would work with it by mistake. If, for >> some hard-to-imagine reason, you have a working copy that you want to >> run against it, you should be able to do an svn switch --relocate on >> your working copy and be back in shape. In fact, it might be a good >> time to give it a try.... >> >> g. >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > Jason Stajich > jason at bioperl.org > http://jason.open-bio.org/ > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Sat Jun 30 22:46:05 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 30 Jun 2007 21:46:05 -0500 Subject: [Bioperl-l] Splits again In-Reply-To: <4686CC04.6000403@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> <4683DBEA.90005@sendu.me.uk> <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu> <468409C7.7020102@sendu.me.uk> <4686CC04.6000403@sendu.me.uk> Message-ID: On Jun 30, 2007, at 4:32 PM, Sendu Bala wrote: > Hilmar Lapp wrote: >> On Jun 28, 2007, at 3:19 PM, Sendu Bala wrote: >>> [...] >>> Very definitely the latter. 
The key benefit of my approach is >>> that the organisation stays as is and that a snapshot of the >>> repository remains a single directory of modules in Bio so that >>> people don't have to 'install' Bioperl, they can still just >>> uncompress the archive (or check out the package from svn) and >>> point their PERL5LIB to the root dir of the package. > [snip] >> In this sense, I understand a release pumpkin will generate ~900 >> packages to upload to CPAN? How much hassle is that compared to >> what uploading a bioperl release means right now? > > I'd have to investigate. I did my uploads using the PAUSE website, > which for 900 packages would be unfeasible. Will have to see if the > process can be automated. Not that they would care one way or another but maybe we should contact the CPAN maintainers to get their thoughts. They might have some ideas... >> How brittle is all the Build.PL code that will be needed to >> automate all of this, and how difficult will it be to maintain? >> For example, if someone adds in 10 new modules, what Build.PL- >> related work will need to be done? > > Well, my plan will be that once the work is done, you won't need to > touch the Build.PL code again. My intent is that the pumpkin can > just type one command and not think about anything. > > As for the reality, I won't know until I think about it properly > and experiment. A good experiment for a branch. I still think this could be accomplished step-wise; for instance run a quick test using something with a simple dependency tree like Bio::Root::Root (only needs RootI), finish up with Bio::Root*, then work down into PrimarySeq, Seq, etc. Submit them to CPAN piecemeal or in batches (all Bio::Seq*, so on). If the Build.PL, etc are to be generated on the fly then maybe there should be a simple way of registering or matching tests to modules (or vice versa) to ease the pain, particularly for new code. 
chris

From hlapp at gmx.net Sat Jun 30 22:56:04 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 30 Jun 2007 22:56:04 -0400
Subject: [Bioperl-l] First cut svn repository
In-Reply-To:
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
	<18051.63674.685297.426813@almost.alerce.com>
	<18052.3946.224905.415905@almost.alerce.com>
	<2159ED58-E6F4-4ED8-AC23-E8BAF69FE240@gmx.net>
Message-ID:

It turns out that both files are also present on the release-0-9-3, bioperl-1-0-0, bioperl-1-0-alpha, and bioperl-1-0-alpha2-rc tags, so add

$ svn rm -m "Removing offending duplicate" svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-3/t/data/HUMBETGLOA.fasta
$ svn rm -m "Removing offending duplicate" svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/bioperl-1-0-0/t/data/HUMBETGLOA.fasta
$ svn rm -m "Removing offending duplicate" svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/bioperl-1-0-alpha/t/data/HUMBETGLOA.fasta
$ svn rm -m "Removing offending duplicate" svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/bioperl-1-0-alpha2-rc/t/data/HUMBETGLOA.fasta

to the post-processing commands.

	-hilmar

On Jun 30, 2007, at 8:40 PM, Chris Fields wrote:

> Checkout worked for me (Mac OS X) using both:
>
> svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data
> svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/
>
> so removing the offending file worked (good catch!). Haven't run a
> full co but probably isn't necessary.
> > chris > > On Jun 30, 2007, at 6:36 PM, Hilmar Lapp wrote: > >> >> On Jun 28, 2007, at 3:43 PM, George Hartzell wrote: >> >>> I just did the experiment, and filename-insensitivity seems to be >>> breaking something. >>> >>> I'm using an svn I picked up from http://www.codingmonkeys.de/mbo/. >>> >>> I reformatted a memory stick to be case sensitive and co of >>> >>> bioperl/bioperl-live/tags/release-0-9-2/t >>> >>> worked, then I made a directory in my home dir (normal mac thing) >>> and >>> got the same error as above. >> >> You picked up a rename of a file from lower case extension to >> upper case extension. Unfortunately, there are several months >> between adding the upper-case and removing the lower-case version. >> >> We can reconstruct what happened with this using svn log on the >> directory (this does not require a checkout): >> >> $ svn log --verbose svn+ssh://dev.open-bio.org/home/hartzell/ >> bioperl/bioperl-live/trunk/t/data >> >> Searching for HUMBETGLOA yields the following two commits that >> added one and removed the other: >> >> --------------------------------------------------------------------- >> --- >> r2245 | jason | 2001-12-08 11:59:05 -0500 (Sat, 08 Dec 2001) | 2 >> lines >> Changed paths: >> M /bioperl-live/trunk/t/SearchIO.t >> A /bioperl-live/trunk/t/data/HUMBETGLOA.FASTA >> A /bioperl-live/trunk/t/data/cysprot1.FASTA >> >> added tests for FASTA >> >> --------------------------------------------------------------------- >> --- >> r2877 | jason | 2002-03-11 22:39:40 -0500 (Mon, 11 Mar 2002) | 2 >> lines >> Changed paths: >> A /bioperl-live/trunk/t/data/HUMBETGLOA.fa >> D /bioperl-live/trunk/t/data/HUMBETGLOA.fasta >> >> renaming file to avoid clobbering on windows >> >> Unfortunately, both files are in the tag (again, no checkout >> required): >> >> $ svn list svn+ssh://dev.open-bio.org/home/hartzell/bioperl/ >> bioperl-live/tags/release-0-9-2/t/data | grep HUMBETGLOA | grep -i >> fasta >> HUMBETGLOA.FASTA >> HUMBETGLOA.fasta >> >> We 
can remove the offending version from the repository (again, >> without needing a checkout): >> >> $ svn rm svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- >> live/tags/release-0-9-2/t/data/HUMBETGLOA.fasta >> >> I did this, and now the tag checks out fine on OSX. Can anyone >> confirm? >> >> (BTW the ability to operate on the repository w/o needing a >> checkout is another advantage of svn) >> >> -hilmar >> -- >> =========================================================== >> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >> =========================================================== >> >> >> >> >> > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : ===========================================================
>> >> >> >> chris >> >> _______________________________________________ >> >> Bioperl-l mailing list >> >> Bioperl-l at lists.open-bio.org >> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> >> >> >> >> Christopher Fields >> Postdoctoral Researcher >> Lab of Dr. Robert Switzer >> Dept of Biochemistry >> University of Illinois Urbana-Champaign >> >> >> >> Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From basu at pharm.stonybrook.edu Sun Jun 3 10:44:18 2007 From: basu at pharm.stonybrook.edu (Siddhartha Basu) Date: Sun, 03 Jun 2007 10:44:18 -0400 Subject: [Bioperl-l] EUtilities overhaul started In-Reply-To: References: Message-ID: On Sat, 2 Jun 2007 00:16:05 -0500 Chris Fields wrote: > To anyone using Bio::DB::EUilities, > > I am in the midst of a major overhaul to the various >EUtilities tools > and to Bio::DB::GenericWebDBI (the latter which I am >forming into > more or less a test bed for other database interfaces). > I'm about > 80% done at this point, and will likely start committing >changes this > coming week. > > The overall interface will change (something I had >warned about in > the Bio::DB::EUtilities POD) but I am hoping it will be >more > intuitive and easier to use in the long run. I'll >describe the > overall redesign and use in an upcoming HOWTO (as >recommended by > Brian a while back). Hi chris, Being a frequent user of EUtilities, hopefully this api facelift and upcoming howto will definitely be more helpful. Anyway, one thing i noticed that for each eutil call such as efetch,epost,esearch,esummary a new 'Bio::DB::Utilities' object has to be instantiated. And thereafter it cannot be set during runtime such as $eutils->id('ids'), for example.... my $eutils = Bio::DB::Eutilities->new ( -id => $id, -eutil => 'esummary', -db => 'protein', ); my $ct = $eutils->get_response->content(); ## -- now i cannot do this... 
$eutils->id($newid); my $ct = $eutils->get_response->content(); Is the new api going to address something along this line or is there currently anyway to reuse the object. Thanks again for this nice toolkit. -siddhartha > > If anyone has any suggestions/ideas/flames, please let >me know! > > Cheers! > > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at uiuc.edu Sun Jun 3 19:52:39 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 3 Jun 2007 18:52:39 -0500 Subject: [Bioperl-l] EUtilities overhaul started In-Reply-To: References: Message-ID: <5120BD7B-CA89-46E4-8D6B-6B24C1F93A5E@uiuc.edu> On Jun 3, 2007, at 9:44 AM, Siddhartha Basu wrote: > ... > Hi chris, > Being a frequent user of EUtilities, hopefully this api facelift > and upcoming howto will definitely be more helpful. > Anyway, one thing i noticed that for each eutil call such as > efetch,epost,esearch,esummary a new 'Bio::DB::Utilities' object has > to be > instantiated. And thereafter it cannot be set during runtime such as > $eutils->id('ids'), for example.... > > my $eutils = Bio::DB::Eutilities->new ( -id => $id, > -eutil => 'esummary', > -db => 'protein', > ); > my $ct = $eutils->get_response->content(); > > ## -- now i cannot do this... > $eutils->id($newid); > my $ct = $eutils->get_response->content(); I'll have to check up on that, though changing id() should work with the old API. It won't matter with the new API (it works fine), but it is still troubling... > Is the new api going to address something along this line or is > there currently anyway to reuse > the object. > Thanks again for this nice toolkit. > > -siddhartha The old API was based upon the idea of creating discrete user agents for each eutil to retrieve data. 
The problem with the old interface is that it attempts to do too much (take care of parameters, set up requests, retrieve responses, parse data, etc.), and many tasks required instantiating a new EUtilities object. I was never really satisfied with it.

The new interface is a composition of three classes: the web user agent (LWP::UserAgent), a class encapsulating parameter handling, and a parser class (all of which can be used independently if needed). When parameters change, a new request is made 'lazily' (i.e. only when needed). Similarly, when data is requested after any parameter change, a new parser instance is created and the new response is parsed. With that in mind you can now do the following:

----------------------------------------
my @params = (-eutil  => 'esearch',
              -db     => 'protein',
              -term   => 'BRCA1',
              -retmax => 100);

my $eutil = Bio::DB::EUtilities->new(@params);

# no need to get response first; get_ids() calls that if needed
my @ids = $eutil->get_ids;

# below changes only those parameters, leaves all others set as before
$eutil->set_parameters(-eutil   => 'efetch',
                       -id      => \@ids,
                       -retmode => 'text',
                       -rettype => 'fasta');

# sends streamed content directly to a file
$eutil->get_response(-content_file => 'seqs.fas');

# or to a LWP::UserAgent-supported request callback
$eutil->get_response(-content_cb => \&my_cb);

my @newparams = (-eutil  => 'esearch',
                 -db     => 'protein',
                 -term   => 'BRCA2',
                 -retmax => 100);

# Resets eutility to passed parameters (or undef)
$eutil->reset_parameters(@newparams);

# retrieve new IDs
my @new_ids = $eutil->get_ids;
----------------------------------------

Note the same eutil object is used for all of the above, so to answer your last question, yes, you should be able to create data pipelines using the same object if necessary.
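
[Editorial sketch] The 'lazy' behaviour described above (a parameter change invalidates the cached response, and the request is only redone when data is next asked for) can be illustrated in plain Perl. This is only a sketch of the general idea and not the actual Bio::DB::EUtilities implementation: the LazyClient class, its hash-based parameters and the fetch_count() method are hypothetical, and the 'response' is just a string built from the parameters rather than a real web request.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical illustration only: LazyClient is NOT part of BioPerl.
package LazyClient;

sub new {
    my ($class, %params) = @_;
    my $self = { params => {%params}, response => undef, fetches => 0 };
    return bless $self, $class;
}

# Change only the given parameters, keep the rest, and drop the cached response.
sub set_parameters {
    my ($self, %new) = @_;
    $self->{params}{$_} = $new{$_} for keys %new;
    $self->{response} = undef;
    return $self;
}

# Replace all parameters with the given ones and drop the cached response.
sub reset_parameters {
    my ($self, %new) = @_;
    $self->{params}   = {%new};
    $self->{response} = undef;
    return $self;
}

# The (pretend) request is made here, and only if nothing is cached.
sub get_response {
    my ($self) = @_;
    unless (defined $self->{response}) {
        $self->{fetches}++;
        my $p = $self->{params};
        $self->{response} = join '&', map { "$_=$p->{$_}" } sort keys %$p;
    }
    return $self->{response};
}

# Number of times a (pretend) request was actually made.
sub fetch_count { return $_[0]{fetches} }

package main;

my $client = LazyClient->new(eutil => 'esearch', db => 'protein', term => 'BRCA1');
print $client->get_response, "\n";   # first call makes the request
print $client->get_response, "\n";   # second call reuses the cached response
$client->set_parameters(eutil => 'efetch');
print $client->get_response, "\n";   # parameters changed, so a new request is made
print "requests made: ", $client->fetch_count, "\n";
```

The point of the design is that setting parameters is cheap and side-effect free; the cost of a request is only paid when a response is actually needed.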
chris From sac at bioperl.org Mon Jun 4 13:56:57 2007 From: sac at bioperl.org (Steve Chervitz) Date: Mon, 4 Jun 2007 10:56:57 -0700 Subject: [Bioperl-l] question about Bio::Restriction::Analysis In-Reply-To: <3E4CBE0B-6EE4-4973-80DF-90C7E778DA83@cshl.edu> References: <3E4CBE0B-6EE4-4973-80DF-90C7E778DA83@cshl.edu> Message-ID: <8f200b4c0706041056o4dbaadfexddf9f82fc33c6da@mail.gmail.com> Hi Apurva, I'm cc:ing the list to let others know you have found performance issues with Bio::Restriction::Analysis. Ideally, we should focus on addressing those issues rather than fixing a module that is now deprecated. But taking a quick look at my Bio::Tools::RestrictionEnzyme module, I'm not sure why HpaII would give slower performance relative to other non-ambiguous cutters. This enzyme has a 4-base recognition sequence CCGG, and if you're feeding it a large CG-rich input sequence, that could be a factor. To test, you might try using some other 4-base cutters that aren't CG-rich (TaqI, TasI) or try some other input sequences. There is no special flag to indicate that the enzyme is non-ambiguous. The module handles that automatically. Good luck, Steve On 6/4/07, Apurva Narechania wrote: > Hi Rob and Steve, > > I was hoping you could answer a quick performance question regarding > the Bio::Restriction::Analysis module. I have found that though this > module works well, it is considerably slower than the deprecated > Bio::Tools::RestrictionEnzyme. I see that there are two algorithms > available to your module, and since I am using HpaII, a non-ambiguous > enzyme, I thought I might find similar performance to the older, > deprecated module, but I do not. Is it possible that I am not setting > the non-ambiguous flag correctly? Does it need to be set in the first > place? 
> > As far as Bio::Tools::RestrictionEnzyme, though it is faster, I have > found instances where it is inaccurate, especially in calculating > fragments of extremely small size 1-5 base pairs, so I would like to > use your module if possible. It just seems slow to me. > > Can you clarify? > > I have copied my code below since it is a short, simple script. > > Thanks! > Apurva Narechania > Ware Lab > Cold Spring Harbor Labs > > ---------- > > #!/usr/bin/perl > > # This program generates a fasta of restriction frags given an > # input fasta and a restriction cut site > > use Getopt::Std; > use Bio::Seq; > use Bio::SeqIO; > use strict; > > use Bio::Tools::RestrictionEnzyme; > > my %opts = (); > getopts ('f:', \%opts); > my $fasta = $opts{'f'}; > > # read fasta file > my $seqin = Bio::SeqIO -> new (-format => 'Fasta', -file => "$fasta"); > > my $x = 0; > while (my $sequence_obj = $seqin -> next_seq()){ > $x++; > my $id = $sequence_obj->id(); > > print STDERR "$x Working on $id\n"; > > # generate the rx object > my $ra = new Bio::Tools::RestrictionEnzyme(-NAME=>'HpaII'); > > my @frags = $ra->cut_seq($sequence_obj); > > my $counter = 0; > foreach my $frag (@frags){ > $counter++; > my $length = length ($frag); > print ">$id.$counter length=$length\n$frag\n"; > } > > } > > From anhthu.tieu at gsf.de Tue Jun 5 04:14:09 2007 From: anhthu.tieu at gsf.de (Tieu, Anh-Thu) Date: Tue, 5 Jun 2007 10:14:09 +0200 Subject: [Bioperl-l] problems with image maps and IE 6 or higher Message-ID: <93739F94E0F3BA43AD72423E2482341A1435F6@sw-rz010.gsf.de> Hi, I have a problem using the bioperl image maps function with the IE6 or and higher browser. It might be a more general problem with IE6 rather than with bioperl, but as I used bioperl to create my image maps, I thought I could still post this problem here and ask for people's opinion. I wondered if anyone else faced the same problem and if possible if anyone could share their experiences and their solutions.

scale alignment5 integration_pt gene intron1 usemap="mapnameD064C01" style="border:2px solid #CCCCCC;"/>

>
> onclick="javascript:void(zmenu( 'scale' ));;return false;" title="scale " alt="scale " target="_blank"/>
> onclick="javascript:void(zmenu( 'alignment 5splk', '', 'seq_id: ', '', 'start: ', '', 'stop: 0', '', 'length: bp', '', 'identity: ', '', 'e-value: ' ));;return false;" title="alignment5 " alt="alignment5 " target="_blank"/>
> onclick="javascript:void(zmenu( 'alignment 5splk', '', 'seq_id: ', '', 'start: ', '', 'stop: 0', '', 'length: bp', '', 'identity: ', '', 'e-value: ' ));;return false;" title="integration_pt " alt="integration_pt " target="_blank"/>
> onclick="javascript:void(zmenu( 'Nphs1 ', '', 'ensembl_id: ENSMUSG00000006649', '', 'start: 30168485', '', 'stop: 30195968', '', 'length: 27483 bp' ));;return false;" title="gene " alt="gene " target="_blank"/>
> onclick="javascript:void(zmenu( 'exon1', '', 'start: 30168485', '', 'stop: 30169003', '', 'length: 518 bp' ));;return false;" title="exon1 " alt="exon1 " target="_blank"/>
> onclick="javascript:void(zmenu( 'intron1', '', 'start: 30169004', '', 'stop: 30169083', '', 'length: 79 bp ' ));;return false;" title="intron1 " alt="intron1 " target="_blank"/>
> onclick="javascript:void(zmenu( 'exon2', '', 'start: 30169084', '', 'stop: 30169299', '', 'length: 215 bp' ));;return false;" title="exon2 " alt="exon2 " target="_blank"/>
> onclick="javascript:void(zmenu( 'intron2', '', 'start: 30169300', '', 'stop: 30169373', '', 'length: 73 bp ' ));;return false;" title="intron2
> ..
>
>
>
> This is part of the code I used in my HTML file to display the image map
> and it really runs beautifully
> with Mozilla 1.7 or the latest Firefox version. However, if used in IE6
> the clickable pop-ups do not appear/work.
>
> I appreciate any help and would like to thank everyone for their help.
>
> Best regards,
>
>
> Anh-Thu
> ________________________________________________________________________
> GSF-Forschungszentrum
>
> Ingolstädter Landstr. 1
>
> 85764 München-Neuherberg, Germany
>
> Chairman of Supervisory Board: MinDir Dr. Peter Lange
>
> Board of Directors: Prof. Dr. Günther Wess and Dr. Nikolaus Blum
>
> Register of Societies: Amtsgericht München HRB 6466
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

--
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu

From cjfields at uiuc.edu Tue Jun 5 11:28:24 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 5 Jun 2007 10:28:24 -0500
Subject: [Bioperl-l] [BioPython] Cannot parse GenBank file
In-Reply-To: <46656D64.7010508@ribosome.natur.cuni.cz>
References: <46655550.70400@ribosome.natur.cuni.cz> <46656D64.7010508@ribosome.natur.cuni.cz>
Message-ID: <24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu>

Martin,

The example file you give in the bioperl bugzilla report has several blank annotation lines which may lead to additional problems. When the BioPerl SeqIO parser finds annotation fields (SOURCE, ORGANISM, DEFINITION, etc) then it expects there will also be relevant data (text descriptions) accompanying it; I assume the BioPython parser expects likewise though I may be wrong.

AFAIK the inclusion of field names w/o text isn't GenBank/EMBL-compliant.
GenBank records lacking text either have a '.' instead or are left out entirely: http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html We could add a fix but you should probably contact the ApE developers and request that field names w/o text be left out or have '.' added. chris On Jun 5, 2007, at 9:04 AM, Martin MOKREJ? wrote: > Ezequiel Panepucci wrote: >>> genbank entry = parser.parse(fhandle) >> >> there is a space character between "genbank" and "entry". >> It is a syntax error. >> I suppose you meant "genbank_entry" ? > > Yes, the next command was right and has shown the error. Sorry, I > forgot > to delete the first attempt. ;-) > >>>> genbank_entry = parser.parse(fhandle) > Traceback (most recent call last): > File "", line 1, in ? > File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py", > line 187, in parse > self._scanner.feed(handle, self._consumer) > File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", > line 360, in feed > self._feed_first_line(consumer, self.line) > File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", > line 835, in _feed_first_line > assert False, \ > AssertionError: Did not recognise the LOCUS line layout: > LOCUS 6499 bp ds-DNA linear 02-AUG-2006 > >>>> > > Martin > _______________________________________________ > BioPython mailing list - BioPython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From stewarta at nmrc.navy.mil Tue Jun 5 11:34:14 2007 From: stewarta at nmrc.navy.mil (Andrew Stewart) Date: Tue, 5 Jun 2007 11:34:14 -0400 Subject: [Bioperl-l] Setting attributes on a Bio::DB::GFF::Feature object Message-ID: <95C9F539-A4C4-4B6A-8DA8-079B957BF909@nmrc.navy.mil> I see bidirectional mutator methods for source, type, strand, etc. 
in the Bio::DB::GFF::Feature documentation but I see that ->attributes is only able to get and not set the feature attributes. Is there no way to modify the attributes of a Bio::DB::GFF::Feature live?

--
Andrew Stewart
Research Assistant, Genomics Team
Navy Medical Research Center (NMRC)
Biological Defense Research Directorate (BDRD)
BDRD Annex
12300 Washington Avenue, 2nd Floor
Rockville, MD 20852
email: stewarta at nmrc.navy.mil
phone: 301-231-6700 Ext 270

From cjfields at uiuc.edu Tue Jun 5 12:07:41 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 5 Jun 2007 11:07:41 -0500
Subject: [Bioperl-l] [BioPython] Cannot parse GenBank file
In-Reply-To: <24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu>
References: <46655550.70400@ribosome.natur.cuni.cz> <46656D64.7010508@ribosome.natur.cuni.cz> <24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu>
Message-ID:

One thing I missed which explains the biopython error: the LOCUS line is missing the locus identifier (see the NCBI example record link). This doesn't choke the bioperl parser but it appears to stop the biopython parser in its tracks (maybe a feature instead of a bug!). You should try adding a unique identifier (maybe the name of the file or record) to the LOCUS line to see if it works:

LOCUS testfile 6499 bp ds-DNA linear 02-AUG-2006

The bioperl parser in CVS writes out the correct alphabet when this is added:

LOCUS testfile 6499 bp ds-DNA linear 02-AUG-2006

I'll try adding a warning to the bioperl parser for this.

chris

On Jun 5, 2007, at 10:28 AM, Chris Fields wrote:

> Martin,
>
> The example file you give in the bioperl bugzilla report has several
> blank annotation lines which may lead to additional problems. When
> the BioPerl SeqIO parser finds annotation fields (SOURCE, ORGANISM,
> DEFINITION, etc) then it expects there will also be relevant data
> (text descriptions) accompanying it; I assume the BioPython parser
> expects likewise though I may be wrong.
> > AFAIK the inclusion of field names w/o text isn't GenBank/EMBL- > compliant. GenBank records lacking text either have a '.' instead or > are left out entirely: > > http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html > > We could add a fix but you should probably contact the ApE developers > and request that field names w/o text be left out or have '.' added. > > chris > > On Jun 5, 2007, at 9:04 AM, Martin MOKREJ? wrote: > >> Ezequiel Panepucci wrote: >>>> genbank entry = parser.parse(fhandle) >>> >>> there is a space character between "genbank" and "entry". >>> It is a syntax error. >>> I suppose you meant "genbank_entry" ? >> >> Yes, the next command was right and has shown the error. Sorry, I >> forgot >> to delete the first attempt. ;-) >> >>>>> genbank_entry = parser.parse(fhandle) >> Traceback (most recent call last): >> File "", line 1, in ? >> File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py", >> line 187, in parse >> self._scanner.feed(handle, self._consumer) >> File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", >> line 360, in feed >> self._feed_first_line(consumer, self.line) >> File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", >> line 835, in _feed_first_line >> assert False, \ >> AssertionError: Did not recognise the LOCUS line layout: >> LOCUS 6499 bp ds-DNA linear 02-AUG-2006 >> >>>>> >> >> Martin >> _______________________________________________ >> BioPython mailing list - BioPython at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biopython > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > > _______________________________________________ > BioPython mailing list - BioPython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From staffa at niehs.nih.gov Tue Jun 5 22:00:34 2007
From: staffa at niehs.nih.gov (Staffa, Nick (NIH/NIEHS))
Date: Tue, 05 Jun 2007 22:00:34 -0400
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To:
Message-ID:

I am wondering whether, if I knew exactly what this error message meant, I could discern my error.
I don't see much difference between this program and programs that worked.
Can I assume that the new worked because an index file exists?
I don't know how the filehandle UTR_TT_GENES gets involved.
Maybe I should use some other module, but I really would like to have get_Seq_by_id functionality.

The error message:
Dpse ortholog = Dpse_GA17307
fetching GA17307
Can't call method "seq" on an undefined value at Match-emNEWTEST.pl line 84, <UTR_TT_GENES> line 4.

Relevant code:
#!/usr/bin/perl
#
#
#
use strict;
use Bio::DB::Fasta;
use Bio::Tools::SeqWords;
use Bio::Seq;
use Bio::SeqIO;
#
my $db = Bio::DB::Fasta->new('/home/staffa/clients/Kari/D_pse_genome/testit/TT_orthologs_Dpse_genes.fa',
                             -makeid => \&make_my_id);
...
...
...
my $pse_obj = $db->get_Seq_by_id('GA17307');
my $pse_sequence = $pse_obj->seq;

Nick Staffa
Telephone: 919-316-4569 (NIEHS: 6-4569)
Scientific Computing Support Group
NIEHS Information Technology Support Services Contract
(Science Task Monitor: John D. Grovenstein (grovens1 at niehs.nih.gov)
National Institute of Environmental Health Sciences
National Institutes of Health
Research Triangle Park, North Carolina

From jason at bioperl.org Tue Jun 5 23:12:40 2007
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 5 Jun 2007 20:12:40 -0700
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To:
References:
Message-ID:

The file handle is probably not important; Perl just reports this if there is a filehandle open. More importantly, what is on line 84? My guess is you are trying to get a sequence out and it doesn't exist - some error-checking code around the lines getting the sequence out would be helpful.
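
[Editorial sketch] The kind of error checking suggested above could look roughly like the following. To keep it runnable without an indexed FASTA file, it uses hypothetical stand-ins (MockFastaDB and MockSeq, which are not BioPerl classes); only the get_Seq_by_id() and seq() calls mirror the ones in the posted code. The helper name fetch_sequence() and the stored ID 'Dpse_GA17307' are assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for a Bio::DB::Fasta handle: get_Seq_by_id() returns
# an object for known IDs and nothing (undef) for unknown ones.
package MockFastaDB;

sub new {
    my ($class, %seqs) = @_;
    return bless { seqs => {%seqs} }, $class;
}

sub get_Seq_by_id {
    my ($self, $id) = @_;
    return unless exists $self->{seqs}{$id};
    return MockSeq->new($self->{seqs}{$id});
}

# Hypothetical stand-in for the sequence object returned by the database.
package MockSeq;

sub new { my ($class, $s) = @_; return bless { seq => $s }, $class }
sub seq { return $_[0]{seq} }

package main;

# Return the sequence string for $id, or die with a message that names the
# missing ID instead of failing later with 'Can't call method "seq"'.
sub fetch_sequence {
    my ($db, $id) = @_;
    my $obj = $db->get_Seq_by_id($id);
    die "No sequence with ID '$id' in the database "
      . "(check the exact ID and the -makeid callback)\n"
      unless defined $obj;
    return $obj->seq;
}

# The index only knows the ID exactly as it was stored.
my $db = MockFastaDB->new('Dpse_GA17307' => 'ATGCATGC');

for my $id ('Dpse_GA17307', 'GA17307') {
    my $seq = eval { fetch_sequence($db, $id) };
    if (defined $seq) {
        print "$id: $seq\n";
    }
    else {
        print "$id: lookup failed: $@";
    }
}
```

With a real Bio::DB::Fasta object the same check applies: test the return value of get_Seq_by_id() before calling seq() on it.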
On Jun 5, 2007, at 7:00 PM, Staffa, Nick (NIH/NIEHS) wrote: > I am wondering if I knew what this error message exactly meant, if > I could > discern my error. > I don't see much difference in this program and programs that worked. > Can I assume that the new worked because an index file exists? > I don't know how the filehandle UTR_TT_GENES gets involved. > Maybe I should use some other module, but I really would like to have > get_Seq_by_id functionality. > > The error message: > Dpse ortholog = Dpse_GA17307 > fetching GA17307 > Can't call method "seq" on an undefined value at Match-emNEWTEST.pl > line 84, > line 4. > > Relevant code: > #!/usr/bin/perl > # > # > # > use strict; > use Bio::DB::Fasta; > use Bio::Tools::SeqWords; > use Bio::Seq; > use Bio::SeqIO; > # > my $db = > Bio::DB::Fasta->new('/home/staffa/clients/Kari/D_pse_genome/testit/ > TT_orthol > ogs_Dpse_genes.fa', > -makeid => \&make_my_id); > ... > ... > ... > my $pse_obj = $db->get_Seq_by_id('GA17307'); > my $pse_sequence = $pse_obj->seq; > > > > > Nick Staffa > Telephone: 919-316-4569 (NIEHS: 6-4569) > Scientific Computing Support Group > NIEHS Information Technology Support Services Contract > (Science Task Monitor: John D. Grovenstein (grovens1 at niehs.nih.gov) > National Institute of Environmental Health Sciences > National Institutes of Health > Research Triangle Park, North Carolina > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: smime.p7s
Type: application/pkcs7-signature
Size: 2613 bytes
Desc: not available
URL:

From torsten.seemann at infotech.monash.edu.au Wed Jun 6 02:06:37 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Wed, 6 Jun 2007 16:06:37 +1000
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To:
References:
Message-ID:

Nick,

> Can't call method "seq" on an undefined value at Match-emNEWTEST.pl line 84,

The error makes it pretty clear. You are calling the ->seq method on an undefined value, ie. $pse_obj.

> my $pse_obj = $db->get_Seq_by_id('GA17307');

# check we got something!
die "sequence not in database" unless $pse_obj;

> my $pse_sequence = $pse_obj->seq;

--
--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University
--Tel +61 3 9905 9010

From shameer at ncbs.res.in Wed Jun 6 02:27:42 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Wed, 6 Jun 2007 11:57:42 +0530 (IST)
Subject: [Bioperl-l] Validation of files using BioPerl
Message-ID: <34441.192.168.1.1.1181111262.squirrel@mail.ncbs.res.in>

Dear All,

How to validate an input file in fasta/PIR/GenPept/PDB format using Bioperl ? (This is to avoid unnecessary files to be submitted to servers by new users). Any module available ?

Many thanks in advance,
--
Shameer Khadar

From cjfields at uiuc.edu Wed Jun 6 08:37:28 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 6 Jun 2007 07:37:28 -0500
Subject: [Bioperl-l] Validation of files using BioPerl
In-Reply-To: <34441.192.168.1.1.1181111262.squirrel@mail.ncbs.res.in>
References: <34441.192.168.1.1.1181111262.squirrel@mail.ncbs.res.in>
Message-ID: <39F5F622-0C93-4DC5-B969-491F789FC932@uiuc.edu>

It has been discussed but never coded. I believe if it passes through the Bio::SeqIO parser it's generally considered validly formatted (spacing, balanced quotes), though it doesn't specifically check FT keys and qualifiers for invalid ones, look for missing annotation, check taxonomy, etc.
As long as the end sequence mark (//) is present for every file, you could try parsing the file into chunks (read with 'local $/ = '//';') and tossing the seq chunks as a filehandle (via IO::String) to a Bio::SeqIO object wrapped in an eval block (the parser resets $/, so it should work). Follow the eval with a check of $@ for caught errors. It might get tedious for big sequences...

chris

On Jun 6, 2007, at 1:27 AM, Shameer Khadar wrote:

> Dear All,
>
> How to validate an input file in fasta/PIR/GenPept/PDB format using
> Bioperl ? (This is to avoid unnecessary files to be submitted to
> servers
> by new users). Any module available ?
>
> Many thanks in advance,
> --
> Shameer Khadar
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From staffa at niehs.nih.gov Wed Jun 6 10:40:49 2007
From: staffa at niehs.nih.gov (Staffa, Nick (NIH/NIEHS))
Date: Wed, 06 Jun 2007 10:40:49 -0400
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To:
Message-ID:

Indeed.
One must know what is actually in his header, AND one must write the appropriate make_id subroutine AND one must specify the exact ID.
THEN things might work.
And they did!
THANK YOU

On 6/6/07 2:06 AM, "Torsten Seemann" wrote:

> Nick,
>
>> Can't call method "seq" on an undefined value at Match-emNEWTEST.pl line 84,
>
> The error makes it pretty clear. You are calling the ->seq method on
> an undefined value, ie. $pse_obj.
>
>> my $pse_obj = $db->get_Seq_by_id('GA17307');
>
> # check we got something!
> die "sequence not in database" unless $pse_obj;
>
>> my $pse_sequence = $pse_obj->seq;
>

From jaudall at gmail.com Wed Jun 6 17:51:33 2007
From: jaudall at gmail.com (Joshua Udall)
Date: Wed, 6 Jun 2007 15:51:33 -0600
Subject: [Bioperl-l] blastxml interation
Message-ID: <52cea20c0706061451i39e44aeev8dc58d1e635665e7@mail.gmail.com>

I was searching in the deobfuscator under *Bio::Search::Result::BlastResult* but there doesn't seem to be a method to extract the iteration number from a blastxml report. I can see this number being possibly useful to count the number of queries that didn't hit anything, since there are no empty reports in the blastxml output. If I'm missing something, I would welcome an example of how to retrieve the result iteration number.

Thanks in advance for any suggestions.

Josh

From dmessina at wustl.edu Wed Jun 6 18:18:26 2007
From: dmessina at wustl.edu (David Messina)
Date: Wed, 6 Jun 2007 17:18:26 -0500
Subject: [Bioperl-l] blastxml interation
In-Reply-To: <52cea20c0706061451i39e44aeev8dc58d1e635665e7@mail.gmail.com>
References: <52cea20c0706061451i39e44aeev8dc58d1e635665e7@mail.gmail.com>
Message-ID:

I think you want to look at the hits(), num_hits() and no_hits_found() methods. There is a private method _next_iteration_index() which should do what you asked for, but num_hits() looks like the better way.

By the way, hits() and num_hits() are listed on the Deobfuscator as having no documentation. This (as the below shows) is incorrect and is due to some nonstandard formatting issues which I will correct. _next_iteration_index() isn't listed on the Deobfuscator because it's a private method.

Hope this helps!
Dave

hits()

This method overrides Bio::Search::Result::GenericResult::hits to take into account the possibility of multiple iterations, as occurs in PSI-BLAST reports. If there are multiple iterations, all 'new' hits for all iterations are returned. These are the hits that did not occur in a previous iteration.
See Also: Bio::Search::Result::GenericResult::hits num_hits() This method overrides Bio::Search::Result::GenericResult::num_hits to take into account the possibility of multiple iterations, as occurs in PSI- BLAST reports. If there are multiple iterations, calling num_hits() returns the number of 'new' hits for each iteration. These are the hits that did not occur in a previous iteration. See Also: Bio::Search::Result::GenericResult::num_hits no_hits_found() Usage : $nohits = $blast->no_hits_found( $iteration_number ); Purpose : Get boolean indicator indicating whether or not any hits were present in the report. This is NOT the same as determining the number of hits via the hits() method, which will return zero hits if there were no hits in the report or if all hits were filtered out during the parse. Thus, this method can be used to distinguish these possibilities for hitless reports generated when filtering. Returns : Boolean Argument : (optional) integer indicating the iteration number (PSI- BLAST) If iteration number is not specified and this is a PSI- BLAST result, then this method will return true only if all iterations had no hits found. From apurva at cshl.edu Wed Jun 6 19:51:45 2007 From: apurva at cshl.edu (Apurva Narechania) Date: Wed, 6 Jun 2007 19:51:45 -0400 Subject: [Bioperl-l] non-palindromic issue in Bio::Restriction::Analysis Message-ID: <3F7C7E33-416A-4141-969A-DDC4716E8A44@cshl.edu> Hi, I was hoping you could confirm and give me some feedback on an issue I think I've found with the Bio::Restriction::Analysis module. I am using the enzyme AciI, a non-palindromic restriction enzyme with a 5' C | CGC 3' recognition site. The module should search both the forward and the reverse complement strings in the case of a non- palindromic enzyme. I have found that the this works only intermittently. 
For example, the following sequence: GAAAAAAACAAAGGAAGAAGCTAGCTAGCAGGGCACGCGGTTTGAGGATGGCTGGTGGCCGACCGCAGGGCG CGCGGTTG GAGGATTGCTGGTGGCCGACCAGATGAAACTCACGCGCGGCTGGGGACAGCTGGAATATTTGGGCGGCGGCG GCTGGTAT TACGGGAAAGGAGAGATAGGGTTTTGGACGGCAGCAGCTGGTATTTGGGCCACCAATTTTGCGCGCCAGTAC AGGACACC GATGCCGCAAATTGCACAATGCCTTTTATGGCGACTGACAGTGCGATGCTATAGGTATGAATTGTCGACTGA CAAAGTGA CACTATTCACATATAAATATAACGAATAACACTCAGTTGGAATATAGACATATGCCGACTCACCATCTGTGG CAATGTAT ACCGACTAACAATTCGATGCTAATTCTCTATTTATAGCGACAGTCGTCAGACACTAATTTGGTGTTGTGGTA TAATGCTA GTGCCTCACCGCTGTAGGTGTTGGTCTACTGGTGC Should digest into 10 fragments using this enzyme, but the module produces only 7. Could you please confirm this behavior, and if observed, suggest some possible fixes? This may be a bug in the _non_pal_enz method, or may be me overlooking something pretty obvious. Thanks, Apurva Narechania. From cjfields at uiuc.edu Wed Jun 6 20:51:00 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 6 Jun 2007 19:51:00 -0500 Subject: [Bioperl-l] blastxml interation In-Reply-To: References: <52cea20c0706061451i39e44aeev8dc58d1e635665e7@mail.gmail.com> Message-ID: Joshua, Just to make sure there is no confusion, do you mean a Bio::Search::Iteration::IterationI-based object? The iteration tags have multiple meanings apparently in BLAST XML output (multiple queries, multiple PSI-BLAST iterations). The current SearchIO::blastxml parser returns multiple Bio::Search::Result::BlastResult objects based on the iterations, so PSI-BLAST output is treated as multiple BLAST reports regardless (i.e. no Iteration objects). This is something I want to rectify but it may not be a easy fix. chris On Jun 6, 2007, at 5:18 PM, David Messina wrote: > I think you want to look at the hits(), num_hits() and no_hits_found > () methods. There is a private method _next_iteration_index() which > should do what you asked for, but num_hits() looks like the better > way. 
> > By the way, hits() and num_hits() are listed on the Deobfuscator as > having no documentation. This (as the below shows) is incorrect and > is due to some nonstandard formatting issues which I will correct. > _next_iteration_index() isn't listed on the Deobfuscator because it's > a private method. > > > Hope this helps! > Dave > > > hits() > > This method overrides Bio::Search::Result::GenericResult::hits to take > into account the possibility of multiple iterations, as occurs in PSI- > BLAST reports. > If there are multiple iterations, all 'new' hits for all iterations > are returned. > These are the hits that did not occur in a previous iteration. > See Also: Bio::Search::Result::GenericResult::hits > > num_hits() > > This method overrides Bio::Search::Result::GenericResult::num_hits to > take > into account the possibility of multiple iterations, as occurs in PSI- > BLAST reports. > If there are multiple iterations, calling num_hits() returns the > number of > 'new' hits for each iteration. These are the hits that did not occur > in a previous iteration. > See Also: Bio::Search::Result::GenericResult::num_hits > > no_hits_found() > > Usage : $nohits = $blast->no_hits_found( $iteration_number ); > Purpose : Get boolean indicator indicating whether or not any hits > were present in the report. > This is NOT the same as determining the number of > hits via > the hits() method, which will return zero hits if there > were no > hits in the report or if all hits were filtered out > during the parse. > > Thus, this method can be used to distinguish these > possibilities > for hitless reports generated when filtering. > > Returns : Boolean > Argument : (optional) integer indicating the iteration number (PSI- > BLAST) > If iteration number is not specified and this is a PSI- > BLAST result, > then this method will return true only if all > iterations had > no hits found. 
> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Wed Jun 6 20:45:14 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 6 Jun 2007 20:45:14 -0400 Subject: [Bioperl-l] PostgreSQL schema support in BioSQL and bioperl-db Message-ID: I have added support to BioSQL and bioperl-db for schemas in PostgreSQL. A schema in PostgreSQL is more or less a namespace for database objects (tables, indexes, views, etc) within a database. (A database in PostgreSQL is similar to the concept of a user in Oracle or MySQL, and therefore for the latter two schemas are synonymous with a user. [Not sure I'm still up-to-date on this for MySQL, but at least that's what I recall.]) When using the load_{seqdatabase,ontology,ncbi_taxonomy}.pl scripts, you specify the schema in which BioSQL resides using the --schema option. If you are using bioperl-db as a library, the Bio::DB::BioDB->new() call also accepts a -schema named parameter, and Bio::DB::DBContextI objects have a $dbc->schema() property for getting/setting the schema, Bio::DB::SimpleDBContext->new() accepts a -schema parameter, and you may also add the property to the .bioperldb connection parameter file (-schema => 'yourschemahere'). Thanks for Brian Osborne for being the instigator (and tester, and for adding the code to load_ncbi_taxonomy.pl - I came too late). 
-hilmar
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From jaudall at gmail.com Wed Jun 6 17:41:08 2007
From: jaudall at gmail.com (Joshua Udall)
Date: Wed, 6 Jun 2007 15:41:08 -0600
Subject: [Bioperl-l] blastxml interation number
Message-ID: <52cea20c0706061441n96ce803v9422e8d14461c2bd@mail.gmail.com>

I was searching in the deobfuscator under
*Bio::Search::Result::BlastResult* but there doesn't seem to be a method to
extract the iteration number from a blastxml report. I can see this number
being very useful to count the number of queries that didn't hit anything,
since there are no empty reports in the blastxml output. If I'm missing
something, I would welcome an example of how to retrieve the result
iteration number; otherwise I'm suggesting that an iteration_count feature
be added to the Result object.

Thanks in advance for any suggestions.

Josh

From holland at ebi.ac.uk Thu Jun 7 03:33:25 2007
From: holland at ebi.ac.uk (Richard Holland)
Date: Thu, 07 Jun 2007 08:33:25 +0100
Subject: [Bioperl-l] PostgreSQL schema support in BioSQL and bioperl-db
In-Reply-To:
References:
Message-ID: <4667B4C5.6070107@ebi.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sounds great. BioJava users shouldn't need to change anything to get
this to work as PostgreSQL JDBC connection objects already require you
to specify a schema.

cheers,
Richard

Hilmar Lapp wrote:
> I have added support to BioSQL and bioperl-db for schemas in PostgreSQL.
> A schema in PostgreSQL is more or less a namespace for database objects
> (tables, indexes, views, etc) within a database.
>
> (A database in PostgreSQL is similar to the concept of a user in Oracle
> or MySQL, and therefore for the latter two schemas are synonymous with a
> user.
[Not sure I'm still up-to-date on this for MySQL, but at least > that's what I recall.]) > > When using the load_{seqdatabase,ontology,ncbi_taxonomy}.pl scripts, you > specify the schema in which BioSQL resides using the --schema option. > > If you are using bioperl-db as a library, the Bio::DB::BioDB->new() call > also accepts a -schema named parameter, and Bio::DB::DBContextI objects > have a $dbc->schema() property for getting/setting the schema, > Bio::DB::SimpleDBContext->new() accepts a -schema parameter, and you may > also add the property to the .bioperldb connection parameter file > (-schema => 'yourschemahere'). > > Thanks for Brian Osborne for being the instigator (and tester, and for > adding the code to load_ncbi_taxonomy.pl - I came too late). > > -hilmar > --=========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.2.2 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGZ7TF4C5LeMEKA/QRApwUAJ48q46iX152pB6Xcc/717Ie8foUTQCgm3ij W/+0iO/ZsNDn1pLuf5yXbYA= =asUn -----END PGP SIGNATURE----- From mmokrejs at ribosome.natur.cuni.cz Thu Jun 7 10:26:44 2007 From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=) Date: Thu, 07 Jun 2007 16:26:44 +0200 Subject: [Bioperl-l] [BioPython] Cannot parse GenBank file In-Reply-To: References: <46655550.70400@ribosome.natur.cuni.cz> <46656D64.7010508@ribosome.natur.cuni.cz> <24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu> Message-ID: <466815A4.9060505@ribosome.natur.cuni.cz> Hi, Chris Fields wrote: > One thing I missed which explains the biopython error: the LOCUS line is > missing the locus identifier (see the NCBI example record link). This > doesn't choke the bioperl parser but it appears to stop the biopython > parser in it's tracks (maybe a feature instead of a bug!). 
> > You should try adding a unique identifier (maybe the name of the file or > record) to the LOCUS line to see if it works: > > LOCUS testfile 6499 bp ds-DNA linear 02-AUG-2006 > > The bioperl parser in CVS writes out the correct alphabet when this is > added: > > LOCUS testfile 6499 bp ds-DNA linear 02-AUG-2006 > > I'll try adding a warning to the bioperl parser for this. I have updated http://bugzilla.open-bio.org/show_bug.cgi?id=2305 but let me emphasize the LOCUS line now contains LOCUS pRL 5428 bp ds-DNA linear 07-JUN-2007 which still does not comply with the line you have proposed. But it can be parsed by bioperl-live from cvs. Is it still wrong? Testcase as pRL.gb-new in the bugzilla record #2305. Martin > > chris > > On Jun 5, 2007, at 10:28 AM, Chris Fields wrote: > >> Martin, >> >> The example file you give in the bioperl bugzilla report has several >> blank annotation lines which may lead to additional problems. When >> the BioPerl SeqIO parser finds annotation fields (SOURCE, ORGANISM, >> DEFINITION, etc) then it expects there will also be relevant data >> (text descriptions) accompanying it; I assume the BioPython parser >> expects likewise though I may be wrong. >> >> AFAIK the inclusion of field names w/o text isn't GenBank/EMBL- >> compliant. GenBank records lacking text either have a '.' instead or >> are left out entirely: >> >> http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html >> >> We could add a fix but you should probably contact the ApE developers >> and request that field names w/o text be left out or have '.' added. >> >> chris >> >> On Jun 5, 2007, at 9:04 AM, Martin MOKREJ? wrote: >> >>> Ezequiel Panepucci wrote: >>>>> genbank entry = parser.parse(fhandle) >>>> >>>> there is a space character between "genbank" and "entry". >>>> It is a syntax error. >>>> I suppose you meant "genbank_entry" ? >>> >>> Yes, the next command was right and has shown the error. Sorry, I >>> forgot >>> to delete the first attempt. 
;-) >>> >>>>>> genbank_entry = parser.parse(fhandle) >>> Traceback (most recent call last): >>> File "", line 1, in ? >>> File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py", >>> line 187, in parse >>> self._scanner.feed(handle, self._consumer) >>> File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", >>> line 360, in feed >>> self._feed_first_line(consumer, self.line) >>> File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", >>> line 835, in _feed_first_line >>> assert False, \ >>> AssertionError: Did not recognise the LOCUS line layout: >>> LOCUS 6499 bp ds-DNA linear 02-AUG-2006 >>> >>>>>> >>> >>> Martin >>> _______________________________________________ >>> BioPython mailing list - BioPython at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/biopython >> >> Christopher Fields >> Postdoctoral Researcher >> Lab of Dr. Robert Switzer >> Dept of Biochemistry >> University of Illinois Urbana-Champaign >> >> >> >> >> _______________________________________________ >> BioPython mailing list - BioPython at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biopython > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > > -- Dr. Martin Mokrejs Dept. 
of Genetics and Microbiology
Faculty of Science, Charles University
Vinicna 5, 128 43 Prague, Czech Republic
http://www.iresite.org
http://www.iresite.org/~mmokrejs

From cjfields at uiuc.edu Thu Jun 7 11:31:45 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 7 Jun 2007 10:31:45 -0500
Subject: [Bioperl-l] [BioPython] Cannot parse GenBank file
In-Reply-To: <466815A4.9060505@ribosome.natur.cuni.cz>
References: <46655550.70400@ribosome.natur.cuni.cz> <46656D64.7010508@ribosome.natur.cuni.cz> <24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu> <466815A4.9060505@ribosome.natur.cuni.cz>
Message-ID: <2A403865-F1E8-4D19-8D19-455C22E7C6D9@uiuc.edu>

On Jun 7, 2007, at 9:26 AM, Martin MOKREJŠ wrote:

> Hi,
>
> Chris Fields wrote:
>> One thing I missed which explains the biopython error: the LOCUS
>> line is missing the locus identifier (see the NCBI example record
>> link). This doesn't choke the bioperl parser but it appears to
>> stop the biopython parser in its tracks (maybe a feature instead
>> of a bug!).
>> You should try adding a unique identifier (maybe the name of the
>> file or record) to the LOCUS line to see if it works:
>> LOCUS testfile 6499 bp ds-DNA linear 02-AUG-2006
>> The bioperl parser in CVS writes out the correct alphabet when
>> this is added:
>> LOCUS testfile 6499 bp ds-DNA linear 02-AUG-2006
>> I'll try adding a warning to the bioperl parser for this.
>
> I have updated http://bugzilla.open-bio.org/show_bug.cgi?id=2305
> but let me emphasize the LOCUS line now contains
> LOCUS pRL 5428 bp ds-DNA linear 07-JUN-2007
>
> which still does not comply with the line you have proposed. But it
> can be parsed by bioperl-live from cvs. Is it still wrong? Testcase as
> pRL.gb-new in the bugzilla record #2305.
>
> Martin

That should work.
There isn't a strict uniqueness test (that would require caching and isn't worth the trouble IMHO), though it's required you add something unique for the accession/locus if you plan on indexing them in the future. Parsing GenBank data produced from third-party software is problematic at best; there seems to be no steadfast rule with GenBank output for some programs, even though the specification is plainly stated in the NCBI release notes. My take on that is to have a stricter (read:follows release notes) GenBank parser which passes off the data in the record to default handler methods. A user could then subjugate the defined handlers with their own by subclassing the default handler class and overloading the methods or adding their own code references directly. chris ... From rich at thevillas.eclipse.co.uk Fri Jun 8 07:00:45 2007 From: rich at thevillas.eclipse.co.uk (richard) Date: Fri, 08 Jun 2007 12:00:45 +0100 Subject: [Bioperl-l] protparam Message-ID: <466936DD.8080604@thevillas.eclipse.co.uk> Hi, I noticed that in April someone asked whether there was a bioperl mod for obtaining protein sequence related properties using protparam. I have a module that could potentially be submitted to bioperl for this purpose. Does anybody have any thoughts on whether it should go in? Example script and the module are at: http://81.5.159.173/webshare/ Cheers Rich From cjfields at uiuc.edu Fri Jun 8 08:37:27 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 8 Jun 2007 07:37:27 -0500 Subject: [Bioperl-l] protparam In-Reply-To: <466936DD.8080604@thevillas.eclipse.co.uk> References: <466936DD.8080604@thevillas.eclipse.co.uk> Message-ID: <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu> Richard, We'll gladly add this in, though it'll need to be bioperlized (inherit Bio::Root::Root). We also generally ask for tests but it should be easy to write up a quick test suite using any protein seq. If you can could you add some bioperl-like POD to the module (i.e. 
SYNOPSIS, AUTHOR, DESCRIPTION, etc)? thanks! chris On Jun 8, 2007, at 6:00 AM, richard wrote: > > Hi, > > I noticed that in April someone asked whether there was a bioperl mod > for obtaining protein sequence related properties using protparam. > I have a module that could potentially be submitted to bioperl for > this > purpose. Does anybody have any thoughts on whether it should go in? > > Example script and the module are at: > > http://81.5.159.173/webshare/ > > > Cheers > Rich > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From mmokrejs at ribosome.natur.cuni.cz Fri Jun 8 07:09:42 2007 From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=) Date: Fri, 08 Jun 2007 13:09:42 +0200 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? Message-ID: <466938F6.7050903@ribosome.natur.cuni.cz> Hi, how can I convert GenBank/EMBL formatted file to a GFF file? The manpage for Bio::Graphics::FeatureFile does not help me in this way. The information is in the file, so I want just to extract the features to a GFF format, probably somewhere the sequence has to be stored ... Is there a tool so I can convert it automatically? ;) This would be great. I can't make the GFF manually for every file. Other programs draw plasmid maps also automatically from the GenBank formatted input so how can I do it in bioperl? 
Thanks for help, Martin From shameer at ncbs.res.in Fri Jun 8 10:11:00 2007 From: shameer at ncbs.res.in (Shameer Khadar) Date: Fri, 8 Jun 2007 19:41:00 +0530 (IST) Subject: [Bioperl-l] protparam In-Reply-To: <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu> References: <466936DD.8080604@thevillas.eclipse.co.uk> <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu> Message-ID: <54411.192.168.1.1.1181311860.squirrel@mail.ncbs.res.in> Richard, I asked for protparam module in bioperl ! Thats a good job. Cheers, SK > Richard, > > We'll gladly add this in, though it'll need to be bioperlized > (inherit Bio::Root::Root). We also generally ask for tests but it > should be easy to write up a quick test suite using any protein seq. > > If you can could you add some bioperl-like POD to the module (i.e. > SYNOPSIS, AUTHOR, DESCRIPTION, etc)? > > thanks! > > chris > > On Jun 8, 2007, at 6:00 AM, richard wrote: > >> >> Hi, >> >> I noticed that in April someone asked whether there was a bioperl mod >> for obtaining protein sequence related properties using protparam. >> I have a module that could potentially be submitted to bioperl for >> this >> purpose. Does anybody have any thoughts on whether it should go in? >> >> Example script and the module are at: >> >> http://81.5.159.173/webshare/ >> >> >> Cheers >> Rich >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Shameer Khadar Prof. R. 
Sowdhamini's Lab (# 25) The Computational Biology Group National Centre for Biological Sciences (TIFR) GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India T - 91-080-23666001 EXT - 6251 W - http://www.ncbs.res.in From dmessina at wustl.edu Fri Jun 8 10:58:20 2007 From: dmessina at wustl.edu (David Messina) Date: Fri, 8 Jun 2007 09:58:20 -0500 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? In-Reply-To: <466938F6.7050903@ribosome.natur.cuni.cz> References: <466938F6.7050903@ribosome.natur.cuni.cz> Message-ID: <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> Hi Martin, You're in luck -- the BioPerl core distribution includes two scripts for doing just that: genbank2gff genbank2gff3 Look in the scripts directory of the distro. Also, there is a *huge* amount of documentation and examples on the BioPerl website. http://www.bioperl.org/wiki/HOWTOs Reading those, reading the FAQ, and searching the mailing list archives are where I look first when I don't know how to do something in BioPerl. Dave -- Dave Messina Senior Analyst, Assembly Group Genome Sequencing Center Washington University St. Louis, MO From rich at thevillas.eclipse.co.uk Fri Jun 8 11:51:21 2007 From: rich at thevillas.eclipse.co.uk (richard) Date: Fri, 08 Jun 2007 16:51:21 +0100 Subject: [Bioperl-l] protparam In-Reply-To: <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu> References: <466936DD.8080604@thevillas.eclipse.co.uk> <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu> Message-ID: <46697AF9.2090502@thevillas.eclipse.co.uk> Hi, ok, great, that's no problem. I'll add the POD and bioperlize it, thanks Rich Chris Fields wrote: > Richard, > > We'll gladly add this in, though it'll need to be bioperlized > (inherit Bio::Root::Root). We also generally ask for tests but it > should be easy to write up a quick test suite using any protein seq. > > If you can could you add some bioperl-like POD to the module (i.e. > SYNOPSIS, AUTHOR, DESCRIPTION, etc)? > > thanks! 
> > chris > > On Jun 8, 2007, at 6:00 AM, richard wrote: > > >> Hi, >> >> I noticed that in April someone asked whether there was a bioperl mod >> for obtaining protein sequence related properties using protparam. >> I have a module that could potentially be submitted to bioperl for >> this >> purpose. Does anybody have any thoughts on whether it should go in? >> >> Example script and the module are at: >> >> http://81.5.159.173/webshare/ >> >> >> Cheers >> Rich >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From cjfields at uiuc.edu Fri Jun 8 13:45:17 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 8 Jun 2007 12:45:17 -0500 Subject: [Bioperl-l] protparam In-Reply-To: <46697AF9.2090502@thevillas.eclipse.co.uk> References: <466936DD.8080604@thevillas.eclipse.co.uk> <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu> <46697AF9.2090502@thevillas.eclipse.co.uk> Message-ID: Another issue is namespace. I suggest Bio::Tools::ProtParam, though there may be some others out there. We can add support for direct Bio::Seq/PrimarySeq input and other odds and ends once it's committed. Good work! chris On Jun 8, 2007, at 10:51 AM, richard wrote: > > Hi, > > ok, great, that's no problem. I'll add the POD and bioperlize it, > > thanks > Rich > > Chris Fields wrote: >> Richard, >> >> We'll gladly add this in, though it'll need to be bioperlized >> (inherit Bio::Root::Root). We also generally ask for tests but it >> should be easy to write up a quick test suite using any protein seq. 
>> >> If you can could you add some bioperl-like POD to the module (i.e. >> SYNOPSIS, AUTHOR, DESCRIPTION, etc)? >> >> thanks! >> >> chris >> >> On Jun 8, 2007, at 6:00 AM, richard wrote: >> >> >>> Hi, >>> >>> I noticed that in April someone asked whether there was a bioperl >>> mod >>> for obtaining protein sequence related properties using protparam. >>> I have a module that could potentially be submitted to bioperl for >>> this >>> purpose. Does anybody have any thoughts on whether it should go in? >>> >>> Example script and the module are at: >>> >>> http://81.5.159.173/webshare/ >>> >>> >>> Cheers >>> Rich >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> >> Christopher Fields >> Postdoctoral Researcher >> Lab of Dr. Robert Switzer >> Dept of Biochemistry >> University of Illinois Urbana-Champaign >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Mon Jun 11 07:30:24 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 11 Jun 2007 07:30:24 -0400 Subject: [Bioperl-l] script to load ITIS taxonomy Message-ID: <897DB32F-4AEE-4388-A499-C71BFD2281DE@gmx.net> Hi all - I added a script to load the ITIS taxonomy (www.itis.gov) into the phylodb module. It is called load_itis_taxonomy.pl and is in the scripts/ directory. 
It is independent of BioPerl right now (the ITIS download is either a MS SQL Server or an Informix dump - no kidding), but I'm hoping that at some point support for this can be integrated into Bio::TreeIO. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Mon Jun 11 08:24:50 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 11 Jun 2007 07:24:50 -0500 Subject: [Bioperl-l] script to load ITIS taxonomy In-Reply-To: <897DB32F-4AEE-4388-A499-C71BFD2281DE@gmx.net> References: <897DB32F-4AEE-4388-A499-C71BFD2281DE@gmx.net> Message-ID: <99AC6C0F-10DD-4587-AFB3-32BC495CD2BD@uiuc.edu> On Jun 11, 2007, at 6:30 AM, Hilmar Lapp wrote: > Hi all - > > I added a script to load the ITIS taxonomy (www.itis.gov) into the > phylodb module. It is called load_itis_taxonomy.pl and is in the > scripts/ directory. > > It is independent of BioPerl right now (the ITIS download is either a > MS SQL Server or an Informix dump - no kidding), but I'm hoping that > at some point support for this can be integrated into Bio::TreeIO. > > -hilmar > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== I second the TreeIO support. Anyone up for it? chris From ryanx07 at hotmail.com Mon Jun 11 11:24:31 2007 From: ryanx07 at hotmail.com (L Xu) Date: Mon, 11 Jun 2007 10:24:31 -0500 Subject: [Bioperl-l] basic questions Message-ID: I just started to learn BioPerl by reading the BioPerl Tutorial on the BioPerl website. By trying the 1st example on my window, use Bio::Perl; $seq_object = get_sequence('swiss',"ID ROA1_HUMAN"); write_sequence(">roa1.fasta",'fasta',$seq_object); I got the error as the following: ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: swissprot stream with no ID. 
Not swissprot in my book
STACK: Error::throw
STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm:350
STACK: Bio::SeqIO::swiss::next_seq C:/Perl/site/lib/Bio\SeqIO\swiss.pm:178
STACK: Bio::DB::WebDBSeqI::get_Seq_by_id C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm:153
STACK: Bio::Perl::get_sequence C:/Perl/site/lib/Bio/Perl.pm:510
STACK: t8.pl:7

I cannot figure out what is wrong and cannot find the solution on the web.
Could someone help me please? Also, this leads to my 2nd question: is there
a way to search the archive of the current list?

Thanks so much
R

From dmessina at wustl.edu Mon Jun 11 12:34:29 2007
From: dmessina at wustl.edu (David Messina)
Date: Mon, 11 Jun 2007 11:34:29 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To:
References:
Message-ID: <25517EA3-7BDA-44AC-BDF3-93A6810D9D63@wustl.edu>

The example code works here, but I'm on OS X. Could you tell us which
version of Perl and BioPerl you are using, and which operating system?
Are you getting anything in the roa1.fasta file?

> is there a way to search the archive of the current list?

http://www.bioperl.org/wiki/Mailing_lists

Dave

From dmessina at wustl.edu Mon Jun 11 14:48:23 2007
From: dmessina at wustl.edu (David Messina)
Date: Mon, 11 Jun 2007 13:48:23 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To:
References:
Message-ID: <127743A7-1923-4DBF-A96E-276B5E0A7692@wustl.edu>

Hi,

Please use 'Reply All' so everyone on the list can follow the
discussion.

Try adding the following line after the line that starts with
$seq_object:

    print STDERR ref($seq_object), "\t", $seq_object->display_id, "\n";

And then run the program again.
What do you get? Could you post a complete printout of what you're doing?

Dave

On Jun 11, 2007, at 11:45 AM, L Xu wrote:

> I used WinXP with BioPerl Inst_version 2.1.8 (Bioperl 1.5.2) and
> activeperl 5.8.8.819 Thank you very much.

From johnsonm at gmail.com Mon Jun 11 20:45:13 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Mon, 11 Jun 2007 19:45:13 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split)
Message-ID:

This bit in Bio::SeqFeature::Gene::Exon is causing me some problems
trying to extend Bio::Tools::Glimmer to handle 'wraparound' genes
(circular genomes):

    sub location {
        my ($self,$value) = @_;

        if(defined($value) && $value->isa('Bio::Location::SplitLocationI')) {
            $self->throw("split or compound location is not allowed ".
                         "for an object of type " . ref($self));
        }
        return $self->SUPER::location($value);
    }

That seems to be there all the way back to the initial revision
(checked in by Hilmar). I presume it's there because of code like
this (from the seq() method in Bio::SeqFeature::Generic):

    # assuming our seq object is sensible, it should not have to yank
    # the entire sequence out here.
    my $seq = $self->{'_gsf_seq'}->trunc($self->start(), $self->end());

That's not going to work too well with a feature that has a
Bio::Location::Split location. Fixing it up seems straightforward, if
a bit hackish. Something like:

    my $seq;

    if (ref($self->location()) eq 'Bio::Location::Split') {
        my $seqstring = '';
        my @sublocs = $self->location()->sub_Location();

        foreach my $subloc (@sublocs) {
            $seqstring .= $self->{'_gsf_seq'}->trunc($subloc->start(),
                                                     $subloc->end())->seq();
        }

        # assign to the outer $seq; a second 'my' here would shadow it
        $seq = Bio::Seq->new(
            -id  => $self->{'_gsf_seq'}->display_id(),
            -seq => $seqstring
        );
    }
    else {
        $seq = $self->{'_gsf_seq'}->trunc($self->start(), $self->end());
    }

I don't see any companion to trunc() in Bio::PrimarySeqI for joining
sequences. A join() would be handy, and make the above cleaner.
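
The concatenation step can be tried out without BioPerl. What follows is only a minimal sketch in plain Perl, not the actual Bio::SeqFeature::Generic code: join_sublocations() is a hypothetical helper, sub-locations are modelled as [start, end] pairs (1-based, inclusive), the sequence is a plain string, and strand is ignored.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): concatenate the pieces of a
# sequence string covered by a list of [start, end] sub-locations.
# Coordinates are 1-based and inclusive, as in BioPerl locations.
sub join_sublocations {
    my ($seqstring, @sublocs) = @_;
    my $joined = '';
    foreach my $subloc (@sublocs) {
        my ($start, $end) = @$subloc;
        die "bad sub-location $start..$end\n"
            if $start < 1 || $end > length($seqstring) || $start > $end;
        # substr() is 0-based, so shift the start by one.
        $joined .= substr($seqstring, $start - 1, $end - $start + 1);
    }
    return $joined;
}

# A 10 bp circular "genome" with a gene that wraps around the origin:
# it starts at position 8, runs to the end, then continues from 1 to 3.
my $genome = 'AAACCCCGGG';
my $gene   = join_sublocations($genome, [8, 10], [1, 3]);
print "$gene\n";    # prints GGGAAA
```

Inside a feature class, trunc() and seq() on the attached sequence object would take the place of substr(), and a minus-strand feature would still need reverse-complementing afterwards.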
Comments, suggestions, rotten fruit?

From torsten.seemann at infotech.monash.edu.au Tue Jun 12 02:18:27 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Tue, 12 Jun 2007 16:18:27 +1000
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split)
In-Reply-To:
References:
Message-ID:

Mark,

> if (ref($self->location()) eq 'Bio::Location::Split') {
>     my $seqstring;
>     my @sublocs = $self->location()->sub_Location();
>
>     foreach my $subloc (@sublocs) {
>         $seqstring .= $self->{'_gsf_seq'}->trunc($subloc->start(),
>             $subloc->end())->seq();
>     }

Can you use the ->spliced_seq() method to do this?

http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/SeqFeatureI.html#POD11

--
--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University
--Tel +61 3 9905 9010

From pengchy at yahoo.com.cn Tue Jun 12 03:00:46 2007
From: pengchy at yahoo.com.cn (杨 鹏程)
Date: Tue, 12 Jun 2007 15:00:46 +0800 (CST)
Subject: [Bioperl-l] Can't locate loadable object for module TFBS::Ext::pwmsearch
Message-ID: <66745.92089.qm@web15205.mail.cnb.yahoo.com>

hi all,

Today I downloaded the TFBS package from http://forkhead.cgb.ki.se/TFBS/,
uncompressed it, and copied all the files contained in the TFBS and Ext
directories to the directory "C:\perl\site\lib", then put Ext under the
TFBS directory. When I run the example script1.pl, I get an error message:

Can't locate loadable object for module TFBS::Ext::pwmsearch in @INC (@INC contains: C:/perl/site/lib C:/perl/lib .) at C:/perl/site/lib/TFBS/Matrix/PWM.pm line 141
Compilation failed in require at C:/perl/site/lib/TFBS/Matrix/PWM.pm line 141, <DATA> line 206.
BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/Matrix/PWM.pm line 141, <DATA> line 206.
Compilation failed in require at C:/perl/site/lib/TFBS/Matrix/PFM.pm line 137, <DATA> line 206.
BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/Matrix/PFM.pm line 137, <DATA> line 206.
Compilation failed in require at C:/perl/site/lib/TFBS/DB/TRANSFAC.pm line 52, line 206. BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/DB/TRANSFAC.pm line 52, line 206. Compilation failed in require at script1.pl line 3, line 206. BEGIN failed--compilation aborted at script1.pl line 3, line 206. shell returned 2 when I run the list_matrices.pl script, the same message respond. But when I empty the pwmsearch.pm file, following message respond: TFBS/Ext/pwmsearch.pm did not return a true value at :/perl/site/lib/TFBS/Matr x/PWM.pm line 141, line 206. BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/Matrix/PWM.pm line 11, line 206. Compilation failed in require at C:/perl/site/lib/TFBS/Matrix/PFM.pm line 137, DATA> line 206. BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/Matrix/PFM.pm line 17, line 206. Compilation failed in require at C:/perl/site/lib/TFBS/DB/TRANSFAC.pm line 52, DATA> line 206. BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/DB/TRANSFAC.pm line2, line 206. Compilation failed in require at script1.pl line 3, line 206. BEGIN failed--compilation aborted at script1.pl line 3, line 206. Is anyone else meet the same problem? Is it a bug for TFBS package? Best wishes! Sincerely, Pengcheng --------------------------------- ????????????????3.5G??????20M?????? From bix at sendu.me.uk Tue Jun 12 03:32:02 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 12 Jun 2007 08:32:02 +0100 Subject: [Bioperl-l] Can't locate loadable object for module TFBS::Ext::pwmsearch In-Reply-To: <66745.92089.qm@web15205.mail.cnb.yahoo.com> References: <66745.92089.qm@web15205.mail.cnb.yahoo.com> Message-ID: <466E4BF2.7020504@sendu.me.uk> ? ?? wrote: > hi all, > > Today, I download the TFBS package from > http://forkhead.cgb.ki.se/TFBS/, and uncompress it and copy all the > files contained in the TFBS and Ext directories to directory > "C:\perl\site\lib", then put Ext under the TFBS directory. 
I run the > example script1.pl, but a wrong message respond: > > Can't locate loadable object for module TFBS::Ext::pwmsearch in @INC You have to follow the installation instructions in the README file. Copying the files out is insufficient - you have to 'make'. From ryanx07 at hotmail.com Tue Jun 12 07:30:09 2007 From: ryanx07 at hotmail.com (L Xu) Date: Tue, 12 Jun 2007 06:30:09 -0500 Subject: [Bioperl-l] basic questions In-Reply-To: <127743A7-1923-4DBF-A96E-276B5E0A7692@wustl.edu> Message-ID: Here is the code: use Bio::Perl; $seq_object = get_sequence('swiss',"ROA1_HUMAN"); print STDERR ref($seq_object), "\t", $seq_object->display_id, "\n"; write_sequence(">roa1.fasta",'fasta',$seq_object); The output looks like the same as the previous version: Microsoft Windows XP [Version 5.1.2600] (C) Copyright 1985-2001 Microsoft Corp. C:\~Scripts>perl test.pl ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: swissprot stream with no ID. Not swissprot in my book STACK: Error::throw STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm:350 STACK: Bio::SeqIO::swiss::next_seq C:/Perl/site/lib/Bio\SeqIO\swiss.pm:178 STACK: Bio::DB::WebDBSeqI::get_Seq_by_id C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm:15 3 STACK: Bio::Perl::get_sequence C:/Perl/site/lib/Bio/Perl.pm:510 STACK: test.pl:7 ----------------------------------------------------------- Thanks. >From: David Messina >To: L Xu >CC: BioPerl list >Subject: Re: [Bioperl-l] basic questions >Date: Mon, 11 Jun 2007 13:48:23 -0500 > >Hi, > >Please use 'Reply All' so everyone on the list can follow the discussion. > >Try adding the following line after the line that starts with $seq_object: > > print STDERR ref($seq_object), "\t", $seq_object->display_id, "\n"; > >And then run the program again. What do you get? Could you post a complete >printout of what you're doing? 
> > >Dave > > >On Jun 11, 2007, at 11:45 AM, L Xu wrote: >>I used WinXP with BioPerl Inst_version 2.1.8 (Bioperl 1.5.2) and >>activeperl 5.8.8.819 Thank you very much. > _________________________________________________________________ Picture this ? share your photos and you could win big! http://www.GETREALPhotoContest.com?ocid=TXT_TAGHM&loc=us From pengchy at yahoo.com.cn Tue Jun 12 10:33:15 2007 From: pengchy at yahoo.com.cn (Pengcheng Yang) Date: Tue, 12 Jun 2007 22:33:15 +0800 (CST) Subject: [Bioperl-l] =?gb2312?q?=BB=D8=B8=B4=A3=BA=20Re:=20=20basic=20questions?= In-Reply-To: Message-ID: <936780.8655.qm@web15215.mail.cnb.yahoo.com> I got the same questions. I guess that the swissprote database has some problems! code: use Bio::DB::SwissProt; $sp = new Bio::DB::SwissProt; $seq = $sp->get_Seq_by_id('KPY1_ECOLI'); print ref($seq),"\t",$seq->display_id,"\n" the mesage: ------------- EXCEPTION ------------- MSG: swissprot stream with no ID. Not swissprot in my book STACK Bio::SeqIO::swiss::next_seq C:/perl/site/lib/Bio\SeqIO\swiss.pm:180 STACK Bio::DB::WebDBSeqI::get_Seq_by_id C:/perl/site/lib/Bio/DB/WebDBSeqI.pm:154 STACK toplevel t.pl:7 -------------------------------------- --- L Xu ????: > Here is the code: > > use Bio::Perl; > $seq_object = get_sequence('swiss',"ROA1_HUMAN"); > print STDERR ref($seq_object), "\t", $seq_object->display_id, "\n"; > write_sequence(">roa1.fasta",'fasta',$seq_object); > > The output looks like the same as the previous version: > > Microsoft Windows XP [Version 5.1.2600] > (C) Copyright 1985-2001 Microsoft Corp. > > C:\~Scripts>perl test.pl > > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: swissprot stream with no ID. 
Not swissprot in my book > STACK: Error::throw > STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm:350 > STACK: Bio::SeqIO::swiss::next_seq > C:/Perl/site/lib/Bio\SeqIO\swiss.pm:178 > STACK: Bio::DB::WebDBSeqI::get_Seq_by_id > C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm:15 > 3 > STACK: Bio::Perl::get_sequence C:/Perl/site/lib/Bio/Perl.pm:510 > STACK: test.pl:7 > ----------------------------------------------------------- > > Thanks. > > > > > > >From: David Messina > >To: L Xu > >CC: BioPerl list > >Subject: Re: [Bioperl-l] basic questions > >Date: Mon, 11 Jun 2007 13:48:23 -0500 > > > >Hi, > > > >Please use 'Reply All' so everyone on the list can follow the > discussion. > > > >Try adding the following line after the line that starts with > $seq_object: > > > > print STDERR ref($seq_object), "\t", $seq_object->display_id, "\n"; > > > >And then run the program again. What do you get? Could you post a > complete > >printout of what you're doing? > > > > > >Dave > > > > > >On Jun 11, 2007, at 11:45 AM, L Xu wrote: > >>I used WinXP with BioPerl Inst_version 2.1.8 (Bioperl 1.5.2) and > >>activeperl 5.8.8.819 Thank you very much. > > > > _________________________________________________________________ > Picture this ?share your photos and you could win big! > http://www.GETREALPhotoContest.com?ocid=TXT_TAGHM&loc=us > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Best wishes! Sincerely, Pengcheng ___________________________________________________________ ????????????????3.5G??????20M?????? 
http://cn.mail.yahoo.com From drummike at gmail.com Tue Jun 12 11:49:36 2007 From: drummike at gmail.com (Mike Williams) Date: Tue, 12 Jun 2007 11:49:36 -0400 Subject: [Bioperl-l] =?GB2312?B?UmU6IFtCaW9wZXJsLWxdILvYuLSjuiBSZTogYmFzaWMgcXVlc3Rpb25z?= In-Reply-To: <936780.8655.qm@web15215.mail.cnb.yahoo.com> References: <936780.8655.qm@web15215.mail.cnb.yahoo.com> Message-ID: On 6/12/07, Pengcheng Yang wrote: > I got the same questions. > I guess that the swissprote database has some problems! > code: > use Bio::DB::SwissProt; > $sp = new Bio::DB::SwissProt; > $seq = $sp->get_Seq_by_id('KPY1_ECOLI'); > print ref($seq),"\t",$seq->display_id,"\n" > ------------- EXCEPTION ------------- > MSG: swissprot stream with no ID. Not swissprot in my book > STACK toplevel t.pl:7 This is a different problem. The id was not valid. If you change KPY1 to KPYK1 it works fine. $seq = $sp->get_Seq_by_id('KPYK1_ECOLI'); print ref($seq),"\t",$seq->display_id,"\n" [mike at Wheatley]$ ./bio_quest2.pl Bio::Seq::RichSeq KPYK1_ECOLI If you got this example from the bio perl site would you please post the url? Seems to me this same problem has come up before, but I could not find it in the archives nor on the web site. Mike From ryanx07 at hotmail.com Tue Jun 12 11:42:28 2007 From: ryanx07 at hotmail.com (L Xu) Date: Tue, 12 Jun 2007 10:42:28 -0500 Subject: [Bioperl-l] basic questions Message-ID: I tested another code (the 2nd test on the same machine) from the tutorial and got error again. I don't know what happened and please help. Thanks so much. 
===========================================================
Code:

use Bio::Restriction::EnzymeCollection;

my $all_collection = Bio::Restriction::EnzymeCollection;
my $six_cutter_collection = $all_collection->cutters(6);

for my $enz ($six_cutter_collection){
    print $enz->name,"\t",$enz->site,"\t",$enz->overhang_seq,"\n";
    # prints name, recognition site, overhang
}

===========================================
Results:

C:\~Scripts>perl t9.pl
Can't use string ("Bio::Restriction::EnzymeCollecti") as a HASH ref while "strict refs" in use at C:/Perl/site/lib/Bio/Restriction/EnzymeCollection.pm line 236.

= = = Original message = = =

On Jun 11, 2007, at 11:45 AM, L Xu wrote:
I used WinXP with BioPerl Inst_version 2.1.8 (Bioperl 1.5.2) and activeperl 5.8.8.819 Thank you very much.

From limericksean at gmail.com Tue Jun 12 12:04:40 2007
From: limericksean at gmail.com (Sean O'Keeffe)
Date: Tue, 12 Jun 2007 18:04:40 +0200
Subject: [Bioperl-l] gff2xml
Message-ID: <462784640706120904g25a6550dsc56a22af64ca98cd@mail.gmail.com>

Hi all,

I posted this on the gbrowse list earlier. I'm looking to convert gff data files into xml. Does anyone know of a module written to do this already?

respect,
sean.
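
For readers with the same question, here is a rough sketch of what a GFF-to-XML conversion could look like when no particular schema is required. It uses plain Perl only. The element name "feature" and the attribute names are invented for this illustration and do not follow any DTD, and the nine tab-separated columns are assumed to be the usual GFF ones.

```perl
use strict;
use warnings;

# Escape the five characters that are special in XML attribute values.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    $text =~ s/'/&apos;/g;
    return $text;
}

# Turn one tab-delimited GFF line into a <feature/> element.
# Element and attribute names are made up for this sketch; no DTD is implied.
sub gff_line_to_xml {
    my ($line) = @_;
    chomp $line;
    my @names  = qw(seqid source type start end score strand phase attributes);
    my @fields = split /\t/, $line, 9;
    return unless @fields == 9;    # skip comments and malformed lines
    my $attrs = join ' ',
        map { sprintf '%s="%s"', $names[$_], xml_escape($fields[$_]) } 0 .. 8;
    return "<feature $attrs/>";
}

my $example = join "\t",
    'chr1', 'glimmer', 'gene', 100, 250, '.', '+', '.', 'ID=gene1';
print gff_line_to_xml($example), "\n";
```

If the XML has to load into a specific program, the target format decides the element names and nesting, so a sketch like this would only be a starting point.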
From johnsonm at gmail.com Tue Jun 12 12:10:45 2007 From: johnsonm at gmail.com (Mark Johnson) Date: Tue, 12 Jun 2007 11:10:45 -0500 Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split) In-Reply-To: References: Message-ID: On 6/12/07, Torsten Seemann wrote: > Can you use the ->spliced_seq() method to do this? > > http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/SeqFeatureI.html#POD11 > > -- > --Torsten Seemann > --Victorian Bioinformatics Consortium, Monash University > --Tel +61 3 9905 9010 Actually, I'd forgotten about spliced_seq(). That seems like it will Do The Right Thing. It's just up to the invoker to call spliced_seq() instead of seq() as appropriate. So, is there any other code that will break if I modify Bio::SeqFeature::Gene::Exon::location to not throw an exception when encountering Bio::Location::SplitLocationI? I'm wondering if it's just a paranoid check or if it's there to guard against something. If the latter, I need to know what code to fix. I'll dig and look, but if anybody knows or has an idea, save me some time. I suppose I can just change it and see what tests start failing. 8) From dmessina at wustl.edu Tue Jun 12 12:11:36 2007 From: dmessina at wustl.edu (David Messina) Date: Tue, 12 Jun 2007 11:11:36 -0500 Subject: [Bioperl-l] basic questions In-Reply-To: References: Message-ID: <30B8F841-E694-4577-8C15-8703E846CDFE@wustl.edu> Hmm, it almost looks like you're having an issue with line breaks. The 'swissprot stream with no ID' error made me think that perhaps Perl wasn't seeing the second argument to get_sequence. And then your new program has the error 'Can't use string ("Bio::Restriction::EnzymeCollecti")' where the end of the word is cut off. I don't know how ActivePerl handles Windows vs UNIX line breaks. Are there any example scripts that come with ActivePerl? 
If there are, and they run correctly, perhaps you could look to see how the line breaks are done and make sure the your program does it the same way. Other than that, I'm not seeing an obvious answer to your problem -- anyone else have a suggestion? Perhaps the easiest thing for you to do would be to reinstall BioPerl and make sure that you run the full test suite and that all of the tests pass. My guess is that something in your current setup is not quite right. Dave From cjfields at uiuc.edu Tue Jun 12 12:42:29 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 12 Jun 2007 11:42:29 -0500 Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split) In-Reply-To: References: Message-ID: On Jun 12, 2007, at 11:10 AM, Mark Johnson wrote: > On 6/12/07, Torsten Seemann > wrote: >> Can you use the ->spliced_seq() method to do this? >> >> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/ >> SeqFeatureI.html#POD11 >> >> -- >> --Torsten Seemann >> --Victorian Bioinformatics Consortium, Monash University >> --Tel +61 3 9905 9010 > > Actually, I'd forgotten about spliced_seq(). That seems like it > will Do The Right Thing. It's just up to the invoker to call > spliced_seq() instead of seq() as appropriate. > So, is there any other code that will break if I modify > Bio::SeqFeature::Gene::Exon::location to not throw an exception when > encountering Bio::Location::SplitLocationI? I'm wondering if it's > just a paranoid check or if it's there to guard against something. If > the latter, I need to know what code to fix. I'll dig and look, but > if anybody knows or has an idea, save me some time. I suppose I can > just change it and see what tests start failing. 8) I'm wondering why you want to use Bio::SeqFeature::Gene::Exon to describe the 'wrap-around' genes. The SeqFeature::Gene::Exon docs state that the Exon class is used to specifically describe exons, as the name implies. 
Exons are primarily eukaryotic in origin, so you shouldn't encounter wraparounds, and should not have split locations by definition (which likely explains the exception). Wouldn't a SeqFeature::Generic work just as well using a split location? chris From johnsonm at gmail.com Tue Jun 12 12:59:54 2007 From: johnsonm at gmail.com (Mark Johnson) Date: Tue, 12 Jun 2007 11:59:54 -0500 Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split) In-Reply-To: References: Message-ID: That's a good point. Both Bio::Tools::Glimmer and Bio::Tools::Genemark produce Bio::SeqFeature::Gene objects, each with a single Bio::SeqFeature::Gene::Exon, when parsing predictions for prokaryotic sequence (multiple exons for eukaryotic). There are eukaryotic and prokaryotic versions of both predictor families. Maybe the most elegant solution would be to simply modify both modules to only emit Bio::SeqFeature::Generic features when operating on prokaryotic mode output? Fix the data model and the problem goes away. 8) On 6/12/07, Chris Fields wrote: > > On Jun 12, 2007, at 11:10 AM, Mark Johnson wrote: > > > On 6/12/07, Torsten Seemann > > wrote: > >> Can you use the ->spliced_seq() method to do this? > >> > >> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/ > >> SeqFeatureI.html#POD11 > >> > >> -- > >> --Torsten Seemann > >> --Victorian Bioinformatics Consortium, Monash University > >> --Tel +61 3 9905 9010 > > > > Actually, I'd forgotten about spliced_seq(). That seems like it > > will Do The Right Thing. It's just up to the invoker to call > > spliced_seq() instead of seq() as appropriate. > > So, is there any other code that will break if I modify > > Bio::SeqFeature::Gene::Exon::location to not throw an exception when > > encountering Bio::Location::SplitLocationI? I'm wondering if it's > > just a paranoid check or if it's there to guard against something. If > > the latter, I need to know what code to fix. 
I'll dig and look, but > > if anybody knows or has an idea, save me some time. I suppose I can > > just change it and see what tests start failing. 8) > > I'm wondering why you want to use Bio::SeqFeature::Gene::Exon to > describe the 'wrap-around' genes. The SeqFeature::Gene::Exon docs > state that the Exon class is used to specifically describe exons, as > the name implies. Exons are primarily eukaryotic in origin, so you > shouldn't encounter wraparounds, and should not have split locations > by definition (which likely explains the exception). > > Wouldn't a SeqFeature::Generic work just as well using a split location? > > chris > From ryanx07 at hotmail.com Tue Jun 12 13:17:18 2007 From: ryanx07 at hotmail.com (L Xu) Date: Tue, 12 Jun 2007 12:17:18 -0500 Subject: [Bioperl-l] basic questions Message-ID: I reinstalled activePerl and BioPerl, now the activePerl is 5.8.8 build 820. However, both scripts generated the same error with my computer. I tested the code in another WinXP computer with the same versions of activePerl and BioPerl, the one for the swissprot did work but the restriction enzyme generated the same error. = = = Original message = = = Hmm, it almost looks like you're having an issue with line breaks. The 'swissprot stream with no ID' error made me think that perhaps? Perl wasn't seeing the second argument to get_sequence. And then your? new program has the error 'Can't use string? ("Bio::Restriction::EnzymeCollecti")' where the end of the word is? cut off. I don't know how ActivePerl handles Windows vs UNIX line breaks.? Are? there any example scripts that come with ActivePerl? If there are,? and they run correctly, perhaps you could look to see how the line? breaks are done and make sure the your program does it the same way. Other than that, I'm not seeing an obvious answer to your problem --? anyone else have a suggestion? Perhaps the easiest thing for you to do would be to reinstall BioPerl? 
and make sure that you run the full test suite and that all of the? tests pass. My guess is that something in your current setup is not? quite right. Dave ___________________________________________________________ Sent by ePrompter, the premier email notification software. Free download at http://www.ePrompter.com. _________________________________________________________________ Get a preview of Live Earth, the hottest event this summer - only on MSN http://liveearth.msn.com?source=msntaglineliveearthhm From cjfields at uiuc.edu Tue Jun 12 13:51:47 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 12 Jun 2007 12:51:47 -0500 Subject: [Bioperl-l] basic questions In-Reply-To: References: Message-ID: This is an instance where 'use strict' would have shown the problem right away. You left off your constructor call: my $all_collection = Bio::Restriction::EnzymeCollection; should be my $all_collection = Bio::Restriction::EnzymeCollection->new; chris On Jun 12, 2007, at 12:17 PM, L Xu wrote: > I reinstalled activePerl and BioPerl, now the activePerl is 5.8.8 > build 820. > However, both scripts generated the same error with my computer. I > tested > the code in another WinXP computer with the same versions of > activePerl and > BioPerl, the one for the swissprot did work but the restriction enzyme > generated the same error. > > = = = Original message = = = > > Hmm, it almost looks like you're having an issue with line breaks. > > The 'swissprot stream with no ID' error made me think that perhaps? > Perl > wasn't seeing the second argument to get_sequence. And then your? new > program has the error 'Can't use string? > ("Bio::Restriction::EnzymeCollecti")' where the end of the word is? > cut off. > > I don't know how ActivePerl handles Windows vs UNIX line breaks.? > Are? there > any example scripts that come with ActivePerl? If there are,? and > they run > correctly, perhaps you could look to see how the line? 
breaks are > done and > make sure the your program does it the same way. > > Other than that, I'm not seeing an obvious answer to your problem > --? anyone > else have a suggestion? > > Perhaps the easiest thing for you to do would be to reinstall > BioPerl? and > make sure that you run the full test suite and that all of the? > tests pass. > My guess is that something in your current setup is not? quite right. > > Dave > > ___________________________________________________________ > Sent by ePrompter, the premier email notification software. > Free download at http://www.ePrompter.com. > > _________________________________________________________________ > Get a preview of Live Earth, the hottest event this summer - only > on MSN > http://liveearth.msn.com?source=msntaglineliveearthhm > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From ryanx07 at hotmail.com Tue Jun 12 14:11:15 2007 From: ryanx07 at hotmail.com (L Xu) Date: Tue, 12 Jun 2007 13:11:15 -0500 Subject: [Bioperl-l] basic questions Message-ID: Thank you very much, it did make the script advanced a bit but I got the following error: C:\~Scripts>perl t9.pl Can't locate object method "name" via package "Bio::Restriction::EnzymeCollectio n" at t9.pl line 5, line 532. I checked the documentation , there is no "name" method for the package. Thanks. = = = Original message = = = This is an instance where 'use strict' would have shown the problem? right away.? You left off your constructor call: my $all_collection = Bio::Restriction::EnzymeCollection; should be my $all_collection = Bio::Restriction::EnzymeCollection->new; chris On Jun 12, 2007, at 12:17 PM, L Xu wrote: I reinstalled activePerl and BioPerl, now the activePerl is 5.8.8? build 820. 
However, both scripts generated the same error with my computer. I? tested the code in another WinXP computer with the same versions of? activePerl and BioPerl, the one for the swissprot did work but the restriction enzyme generated the same error. = = = Original message = = = Hmm, it almost looks like you're having an issue with line breaks. The 'swissprot stream with no ID' error made me think that perhaps?? Perl wasn't seeing the second argument to get_sequence. And then your? new program has the error 'Can't use string? ("Bio::Restriction::EnzymeCollecti")' where the end of the word is?? cut off. I don't know how ActivePerl handles Windows vs UNIX line breaks.?? Are? there any example scripts that come with ActivePerl? If there are,? and? they run correctly, perhaps you could look to see how the line? breaks are? done and make sure the your program does it the same way. Other than that, I'm not seeing an obvious answer to your problem? --? anyone else have a suggestion? Perhaps the easiest thing for you to do would be to reinstall? BioPerl? and make sure that you run the full test suite and that all of the?? tests pass. My guess is that something in your current setup is not? quite right. Dave ___________________________________________________________ Sent by ePrompter, the premier email notification software. Free download at http://www.ePrompter.com. _________________________________________________________________ Get a preview of Live Earth, the hottest event this summer - only? on MSN http://liveearth.msn.com?source=msntaglineliveearthhm _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign ___________________________________________________________ Sent by ePrompter, the premier email notification software. 
From cjfields at uiuc.edu Tue Jun 12 14:35:15 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 12 Jun 2007 13:35:15 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To:
References:
Message-ID: <287E93E2-1902-4796-971E-B1DCA805D032@uiuc.edu>

Bio::Restriction::EnzymeCollection holds Bio::Restriction::Enzyme objects, each with its own name(). Using grouped methods like '$collection->cutters(6)' will retrieve a new EnzymeCollection containing all six-cutters from the original collection. You should use one of the EnzymeCollection accessor methods to retrieve the enzyme that you wanted first or iterate through them all. This works for me:

    use Bio::Restriction::EnzymeCollection;

    my $all_collection = Bio::Restriction::EnzymeCollection->new();
    my $six_cutter_collection = $all_collection->cutters(6);

    for my $enz ($six_cutter_collection->each_enzyme){
        print $enz->name,"\t",$enz->site,"\t",$enz->overhang_seq,"\n";
    }

chris

On Jun 12, 2007, at 1:11 PM, L Xu wrote:
> Thank you very much, it did make the script advanced a bit but I
> got the following error:
>
> C:\~Scripts>perl t9.pl
> Can't locate object method "name" via package
> "Bio::Restriction::EnzymeCollection" at t9.pl line 5, line 532.
>
> I checked the documentation , there is no "name" method for the
> package. Thanks.

From johnsonm at gmail.com Tue Jun 12 15:07:57 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Tue, 12 Jun 2007 14:07:57 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split)
In-Reply-To:
References:
Message-ID:

I'll wait a day, and if there is no opinion to the contrary, implement it this way.

On 6/12/07, Mark Johnson wrote:
> That's a good point.
Both Bio::Tools::Glimmer and > Bio::Tools::Genemark produce Bio::SeqFeature::Gene objects, each with > a single Bio::SeqFeature::Gene::Exon, when parsing predictions for > prokaryotic sequence (multiple exons for eukaryotic). There are > eukaryotic and prokaryotic versions of both predictor families. Maybe > the most elegant solution would be to simply modify both modules to > only emit Bio::SeqFeature::Generic features when operating on > prokaryotic mode output? Fix the data model and the problem goes > away. 8) From torsten.seemann at infotech.monash.edu.au Tue Jun 12 20:18:27 2007 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Wed, 13 Jun 2007 10:18:27 +1000 Subject: [Bioperl-l] gff2xml In-Reply-To: <462784640706120904g25a6550dsc56a22af64ca98cd@mail.gmail.com> References: <462784640706120904g25a6550dsc56a22af64ca98cd@mail.gmail.com> Message-ID: Sean > I posted this on the gbrowse list earlier. I'm looking to convert gff > data files into xml. Does anyone know of a module written to do this > already? What DTD do you want the XML to conform to? eg. ChadoXML, TinySeq XML, TIGR XML ... ? What program are you trying to get to load the XML? BioPerl has some Bio::SeqIO:xxxxx modules for some XML formats that you could use. There is a script "bp_seqconvert.pl -h" which comes with BioPerl which may be useful. -- --Torsten Seemann --Victorian Bioinformatics Consortium, Monash University --Tel +61 3 9905 9010 From hlapp at gmx.net Tue Jun 12 20:55:57 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 12 Jun 2007 20:55:57 -0400 Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split) In-Reply-To: References: Message-ID: <0915FAB4-E554-4E65-BA3F-1B916F0F95FC@gmx.net> I think it was just trying to guard against people trying to do stupid things. I'm actually not sure that representing locations on a circular genome using split locations really is the best thing. 
I'm wondering whether one shouldn't rather introduce a CircularLocation object (though obviously it isn't the location that's circular...). Just a thought. In the end, if you have a way to make this work that you feel comfortable with than go for it. -hilmar On Jun 12, 2007, at 12:10 PM, Mark Johnson wrote: > On 6/12/07, Torsten Seemann > wrote: >> Can you use the ->spliced_seq() method to do this? >> >> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/ >> SeqFeatureI.html#POD11 >> >> -- >> --Torsten Seemann >> --Victorian Bioinformatics Consortium, Monash University >> --Tel +61 3 9905 9010 > > Actually, I'd forgotten about spliced_seq(). That seems like it > will Do The Right Thing. It's just up to the invoker to call > spliced_seq() instead of seq() as appropriate. > So, is there any other code that will break if I modify > Bio::SeqFeature::Gene::Exon::location to not throw an exception when > encountering Bio::Location::SplitLocationI? I'm wondering if it's > just a paranoid check or if it's there to guard against something. If > the latter, I need to know what code to fix. I'll dig and look, but > if anybody knows or has an idea, save me some time. I suppose I can > just change it and see what tests start failing. 8) > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Tue Jun 12 20:57:06 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 12 Jun 2007 20:57:06 -0400 Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split) In-Reply-To: References: Message-ID: <80EAA2F1-B2DA-45F0-B591-8534C356E679@gmx.net> I like that. 
Don't force a model to do what you want if it doesn't really apply anyway. -hilmar On Jun 12, 2007, at 12:59 PM, Mark Johnson wrote: > That's a good point. Both Bio::Tools::Glimmer and > Bio::Tools::Genemark produce Bio::SeqFeature::Gene objects, each with > a single Bio::SeqFeature::Gene::Exon, when parsing predictions for > prokaryotic sequence (multiple exons for eukaryotic). There are > eukaryotic and prokaryotic versions of both predictor families. Maybe > the most elegant solution would be to simply modify both modules to > only emit Bio::SeqFeature::Generic features when operating on > prokaryotic mode output? Fix the data model and the problem goes > away. 8) > > On 6/12/07, Chris Fields wrote: >> >> On Jun 12, 2007, at 11:10 AM, Mark Johnson wrote: >> >>> On 6/12/07, Torsten Seemann >>> wrote: >>>> Can you use the ->spliced_seq() method to do this? >>>> >>>> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/ >>>> SeqFeatureI.html#POD11 >>>> >>>> -- >>>> --Torsten Seemann >>>> --Victorian Bioinformatics Consortium, Monash University >>>> --Tel +61 3 9905 9010 >>> >>> Actually, I'd forgotten about spliced_seq(). That seems like it >>> will Do The Right Thing. It's just up to the invoker to call >>> spliced_seq() instead of seq() as appropriate. >>> So, is there any other code that will break if I modify >>> Bio::SeqFeature::Gene::Exon::location to not throw an exception when >>> encountering Bio::Location::SplitLocationI? I'm wondering if it's >>> just a paranoid check or if it's there to guard against >>> something. If >>> the latter, I need to know what code to fix. I'll dig and look, but >>> if anybody knows or has an idea, save me some time. I suppose I can >>> just change it and see what tests start failing. 8) >> >> I'm wondering why you want to use Bio::SeqFeature::Gene::Exon to >> describe the 'wrap-around' genes. The SeqFeature::Gene::Exon docs >> state that the Exon class is used to specifically describe exons, as >> the name implies. 
Exons are primarily eukaryotic in origin, so you >> shouldn't encounter wraparounds, and should not have split locations >> by definition (which likely explains the exception). >> >> Wouldn't a SeqFeature::Generic work just as well using a split >> location? >> >> chris >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Tue Jun 12 21:20:41 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 12 Jun 2007 20:20:41 -0500 Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split) In-Reply-To: <80EAA2F1-B2DA-45F0-B591-8534C356E679@gmx.net> References: <80EAA2F1-B2DA-45F0-B591-8534C356E679@gmx.net> Message-ID: <951EB9CA-2066-4CD1-BCD5-4E00232CA507@uiuc.edu> It will be interesting to see if bioperl handles wrap-around split locations via spliced_seq() and other methods. I can't see why it wouldn't but one never knows. Might be something to add to location tests at some point... chris On Jun 12, 2007, at 7:57 PM, Hilmar Lapp wrote: > I like that. Don't force a model to do what you want if it doesn't > really apply anyway. > > -hilmar > > On Jun 12, 2007, at 12:59 PM, Mark Johnson wrote: > >> That's a good point. Both Bio::Tools::Glimmer and >> Bio::Tools::Genemark produce Bio::SeqFeature::Gene objects, each with >> a single Bio::SeqFeature::Gene::Exon, when parsing predictions for >> prokaryotic sequence (multiple exons for eukaryotic). There are >> eukaryotic and prokaryotic versions of both predictor families. >> Maybe >> the most elegant solution would be to simply modify both modules to >> only emit Bio::SeqFeature::Generic features when operating on >> prokaryotic mode output? 
Fix the data model and the problem goes >> away. 8) >> >> On 6/12/07, Chris Fields wrote: >>> >>> On Jun 12, 2007, at 11:10 AM, Mark Johnson wrote: >>> >>>> On 6/12/07, Torsten Seemann >>>> wrote: >>>>> Can you use the ->spliced_seq() method to do this? >>>>> >>>>> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/ >>>>> SeqFeatureI.html#POD11 >>>>> >>>>> -- >>>>> --Torsten Seemann >>>>> --Victorian Bioinformatics Consortium, Monash University >>>>> --Tel +61 3 9905 9010 >>>> >>>> Actually, I'd forgotten about spliced_seq(). That seems >>>> like it >>>> will Do The Right Thing. It's just up to the invoker to call >>>> spliced_seq() instead of seq() as appropriate. >>>> So, is there any other code that will break if I modify >>>> Bio::SeqFeature::Gene::Exon::location to not throw an exception >>>> when >>>> encountering Bio::Location::SplitLocationI? I'm wondering if it's >>>> just a paranoid check or if it's there to guard against >>>> something. If >>>> the latter, I need to know what code to fix. I'll dig and look, >>>> but >>>> if anybody knows or has an idea, save me some time. I suppose I >>>> can >>>> just change it and see what tests start failing. 8) >>> >>> I'm wondering why you want to use Bio::SeqFeature::Gene::Exon to >>> describe the 'wrap-around' genes. The SeqFeature::Gene::Exon docs >>> state that the Exon class is used to specifically describe exons, as >>> the name implies. Exons are primarily eukaryotic in origin, so you >>> shouldn't encounter wraparounds, and should not have split locations >>> by definition (which likely explains the exception). >>> >>> Wouldn't a SeqFeature::Generic work just as well using a split >>> location? 
>>> >>> chris >>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From ryanx07 at hotmail.com Wed Jun 13 08:16:15 2007 From: ryanx07 at hotmail.com (L Xu) Date: Wed, 13 Jun 2007 07:16:15 -0500 Subject: [Bioperl-l] Example code in Bioperl Tutorial Message-ID: Thanks so much, Chris, it works now. All the codes I tested were copied from Bioperl Tutorial. Why did they have such problems, because of the platform issue or different versions of BioPerl? I tested so far 6 scripts, three work and three don't. Here is the problem for the 3rd failed script: ================================= use strict; use Bio::Tools::Run::RemoteBlast; my $remote_blast = Bio::Tools::Run::RemoteBlast->new ( -prog => 'blastn', -data => 'ecoli', -expect => '1e-10' ); my $r = $remote_blast->submit_blast("d1.fa"); my $rc; while ( my @rids = $remote_blast->each_rid ) { for my $rid ( @rids ) { $rc = $remote_blast->retrieve_blast($rid); } } print "$rc\n"; #I just want to print sth here before parsing the result =========================================================d1.fa >example CCCTTCAGGTACCCCGAGGTAACACGAGACACTCGGGATCTGGGAAGGGGACTGGGGCTTCTTTAAAAGCGCTCAGTTTAAAAAGCTTCTATGCCTGAATAGGTGACCGGAGGCCGGCACC =========================================================result C:\>perl t13.pl -------------------- WARNING --------------------- MSG: An Error Occurred

An Error Occurred

500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error) --------------------------------------------------- -------------------- WARNING --------------------- MSG: An Error Occurred

An Error Occurred

500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
---------------------------------------------------
Terminating on signal SIGINT(2)

C:\>

Please help me to correct the problem, thanks.

= = = Original message = = =

Bio::Restriction::EnzymeCollection holds Bio::Restriction::Enzyme
objects, each with its own name(). Using grouped methods like
'$collection->cutters(6)' will retrieve a new EnzymeCollection
containing all six-cutters from the original collection. You should
use one of the EnzymeCollection accessor methods to retrieve the
enzyme that you wanted first or iterate through them all. This works
for me:

use Bio::Restriction::EnzymeCollection;

my $all_collection = Bio::Restriction::EnzymeCollection->new();
my $six_cutter_collection = $all_collection->cutters(6);
for my $enz ($six_cutter_collection->each_enzyme) {
    print $enz->name,"\t",$enz->site,"\t",$enz->overhang_seq,"\n";
}

chris

On Jun 12, 2007, at 1:11 PM, L Xu wrote:

Thank you very much, it did make the script advance a bit but I
got the following error:

C:\~Scripts>perl t9.pl
Can't locate object method "name" via package
"Bio::Restriction::EnzymeCollection" at t9.pl line 5, line 532.

I checked the documentation; there is no "name" method for the
package.

Thanks.

From cjfields at uiuc.edu Wed Jun 13 10:41:55 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 09:41:55 -0500
Subject: [Bioperl-l] Example code in Bioperl Tutorial
In-Reply-To:
References:
Message-ID: <4F7BE556-BD8C-4378-BDE7-1F31364F49DA@uiuc.edu>

Judging by the output it looks like you have no network access or
can't connect to the server (what remoteblast needs). Make sure you
don't need proxy settings.

To preempt the next question, no, I'm not going to explain what a
proxy is. The RemoteBlast docs show how to set them, and Google is a
wonderful tool...

chris

On Jun 13, 2007, at 7:16 AM, L Xu wrote:

> ...
> -------------------- WARNING ---------------------
> MSG:
> An Error Occurred
>
>

An Error Occurred

> 500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
>
>
>
> ---------------------------------------------------
> ...

From ryanx07 at hotmail.com Wed Jun 13 11:01:07 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Wed, 13 Jun 2007 10:01:07 -0500
Subject: [Bioperl-l] Example code in Bioperl Tutorial
Message-ID:

I do have an internet connection but do not use a proxy server. I
tested the network connection with the ping command (below). The NCBI
website does not respond. Is there any special network setting needed
for connecting to the NCBI website? Thank you so much.

C:\>ping www.yahoo.com

Pinging www.yahoo-ht3.akadns.net [69.147.114.210] with 32 bytes of data:

Reply from 69.147.114.210: bytes=32 time=363ms TTL=45
Reply from 69.147.114.210: bytes=32 time=319ms TTL=45
Reply from 69.147.114.210: bytes=32 time=312ms TTL=45
Reply from 69.147.114.210: bytes=32 time=360ms TTL=45

Ping statistics for 69.147.114.210:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 312ms, Maximum = 363ms, Average = 338ms

C:\>ping www.ncbi.nlm.nih.gov

Pinging www.ncbi.nlm.nih.gov [130.14.29.110] with 32 bytes of data:

Request timed out.
Request timed out.
Request timed out.
Request timed out.

Ping statistics for 130.14.29.110:
    Packets: Sent = 4, Received = 0, Lost = 4 (100% loss),

= = = Original message = = =

Judging by the output it looks like you have no network access or
can't connect to the server (what remoteblast needs). Make sure you
don't need proxy settings.

To preempt the next question, no, I'm not going to explain what a
proxy is. The RemoteBlast docs show how to set them, and Google is a
wonderful tool...

chris

On Jun 13, 2007, at 7:16 AM, L Xu wrote:

...
-------------------- WARNING ---------------------
MSG: An Error Occurred

An Error Occurred

500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
---------------------------------------------------
...

From cjfields at uiuc.edu Wed Jun 13 12:14:22 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 11:14:22 -0500
Subject: [Bioperl-l] method naming
Message-ID: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>

Some quick questions on method naming. I couldn't find this on the
mail list previously and just want some opinions.

1) Is there any preference on how to name a method that returns a
list of class instances vs. data? I have seen 'each' (each_Location,
each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs.
simple (hits, hsps).

2) Do we want methods that return objects to have the object name
in Title Case (each_Location, get_Seq_by_id, etc.), or does it really
matter?

chris

From dmessina at wustl.edu Wed Jun 13 12:41:53 2007
From: dmessina at wustl.edu (David Messina)
Date: Wed, 13 Jun 2007 11:41:53 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
Message-ID: <9A51046D-4827-4CF7-A2B7-7880E03129E9@wustl.edu>

> 1) Is there any preference on how to name a method that returns a
> list of class instances vs. data? I have seen 'each' (each_Location,
> each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs.
> simple (hits, hsps).

I'd prefer 'get_all' because it's more intuitive to me what the
method is doing. 'Each' is too programmer-y.
> 2) Do we want have methods which return objects have the object name > in Title Case (each_Location, get_Seq_by_id, etc) or does it really > matter? I like Title Case because it reinforces the notion that what you're getting back is a specific object with that name (Seq) rather than the generic thing that the name represents (AGTCTGTGATAT, the actual sequence as a string). Dave From hlapp at gmx.net Wed Jun 13 13:03:59 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 13 Jun 2007 13:03:59 -0400 Subject: [Bioperl-l] method naming In-Reply-To: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> Message-ID: <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> We set a convention a while back on how to name these. It is implemented in the bioperl.lisp file (too bad no one is using emacs any more these days - it's a great editor), and in fact we started a renaming campaign (not sure when that was) on the SeqI and SeqFeatureI classes (you'll still see the old names aliased). However, we never got to finish the clean up. The convention was to use get_{ClassName}s, and get_all_{ClassName}s if there is a difference to the former (mostly because of hierarchical data; for example features can be nested, and get_all_SeqFeatures returns them all flattened out, while get_SeqFeatures returns only the top objects), and for modifying add_ {ClassName} and remove_{ClassName}s. The class name was to be in title case to emphasize the fact that it is an array of object you'd be getting back (and what kind of objects). If it is strings or any other scalar type, the name would be in lower case. -hilmar On Jun 13, 2007, at 12:14 PM, Chris Fields wrote: > Some quick questions on method naming. I couldn't find this on the > mail list previously and just want some opinions. > > 1) Is there any preference on how to name a method that returns a > list of class instances vs. data? I have seen 'each' (each_Location, > each_tag_value) vs. 
'get_all' (get_all_tags, get_all_SeqFeatures) vs. > simple (hits, hsps). > > 2) Do we want have methods which return objects have the object name > in Title Case (each_Location, get_Seq_by_id, etc) or does it really > matter? > > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Wed Jun 13 13:19:43 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 13 Jun 2007 12:19:43 -0500 Subject: [Bioperl-l] method naming In-Reply-To: <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> Message-ID: Sounds good. I agree with Dave also one the use of 'each', as it's a bit ambiguous (seems to imply iteration as opposed to returning a whole list). We probably need to post this somewhere on the wiki for future reference; maybe in Advanced BioPerl? I'll add this in shortly. chris On Jun 13, 2007, at 12:03 PM, Hilmar Lapp wrote: > We set a convention a while back on how to name these. It is > implemented in the bioperl.lisp file (too bad no one is using emacs > any more these days - it's a great editor), and in fact we started > a renaming campaign (not sure when that was) on the SeqI and > SeqFeatureI classes (you'll still see the old names aliased). > > However, we never got to finish the clean up. > > The convention was to use get_{ClassName}s, and get_all_{ClassName} > s if there is a difference to the former (mostly because of > hierarchical data; for example features can be nested, and > get_all_SeqFeatures returns them all flattened out, while > get_SeqFeatures returns only the top objects), and for modifying > add_{ClassName} and remove_{ClassName}s. 
> > The class name was to be in title case to emphasize the fact that > it is an array of object you'd be getting back (and what kind of > objects). If it is strings or any other scalar type, the name would > be in lower case. > > -hilmar > > On Jun 13, 2007, at 12:14 PM, Chris Fields wrote: > >> Some quick questions on method naming. I couldn't find this on the >> mail list previously and just want some opinions. >> >> 1) Is there any preference on how to name a method that returns a >> list of class instances vs. data? I have seen 'each' (each_Location, >> each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs. >> simple (hits, hsps). >> >> 2) Do we want have methods which return objects have the object name >> in Title Case (each_Location, get_Seq_by_id, etc) or does it really >> matter? >> >> chris >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Wed Jun 13 14:43:41 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 13 Jun 2007 13:43:41 -0500 Subject: [Bioperl-l] method naming In-Reply-To: <467036FC.8000505@watson.wustl.edu> References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <9A51046D-4827-4CF7-A2B7-7880E03129E9@wustl.edu> <467036FC.8000505@watson.wustl.edu> Message-ID: <286EE81C-0926-4AAE-9110-02948DFADF36@uiuc.edu> On Jun 13, 2007, at 1:27 PM, Michael Kiwala wrote: > > David Messina wrote: >>> 1) Is there any preference on how to name a method that returns a >>> list of class instances vs. data? I have seen >>> 'each' (each_Location, >>> each_tag_value) vs. 
'get_all' (get_all_tags, get_all_SeqFeatures)
>>> vs.
>>> simple (hits, hsps).
>>>
>>
>> I'd prefer 'get_all' because it's more intuitive to me what the
>> method is doing. 'Each' is too programmer-y.
>>
>>
>>
> When I think 'get_all', I think of a method that returns a list of
> objects at once. When I think of 'each', I think of a method that
> returns a scalar but can be called multiple times to iterate over a
> set of objects.

Yep, hence the ambiguity issue (and my confusion). I think it was so
you could both iterate and return a list using this:

for my $obj ($seq->each_Class) {...}

my @objs = $seq->each_Class;

I use 'next' and 'get/get_all' as an iterator and get accessor
(similar to how it's used in Bio::SearchIO):

while (my $obj = $seq->next_Class) {...}

my @objs = $seq->get_Class; # or get_all_Class for flattened lists

which to me is much clearer.

chris

From mkiwala at watson.wustl.edu Wed Jun 13 14:27:08 2007
From: mkiwala at watson.wustl.edu (Michael Kiwala)
Date: Wed, 13 Jun 2007 13:27:08 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <9A51046D-4827-4CF7-A2B7-7880E03129E9@wustl.edu>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <9A51046D-4827-4CF7-A2B7-7880E03129E9@wustl.edu>
Message-ID: <467036FC.8000505@watson.wustl.edu>

David Messina wrote:
>> 1) Is there any preference on how to name a method that returns a
>> list of class instances vs. data? I have seen 'each' (each_Location,
>> each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs.
>> simple (hits, hsps).
>>
>
> I'd prefer 'get_all' because it's more intuitive to me what the
> method is doing. 'Each' is too programmer-y.
>
>
>
When I think 'get_all', I think of a method that returns a list of
objects at once. When I think of 'each', I think of a method that
returns a scalar but can be called multiple times to iterate over a
set of objects.
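
[Editorial sketch] The accessor styles compared in this thread (get_{ClassName}s for top-level objects, get_all_{ClassName}s for a flattened list, add_{ClassName} for modification, and a next_{ClassName} iterator) might look roughly like this. FeatureHolder is a hypothetical class made up for illustration; it is not part of BioPerl, and the real SeqI/SeqFeatureI implementations may well differ.

```perl
use strict;
use warnings;

# Hypothetical container class, for illustration only (NOT a BioPerl class).
package FeatureHolder;

sub new {
    my ($class) = @_;
    return bless { features => [], cursor => 0 }, $class;
}

# add_{ClassName}: append one or more objects.
sub add_Feature {
    my ($self, @feats) = @_;
    push @{ $self->{features} }, @feats;
    return scalar @{ $self->{features} };
}

# get_{ClassName}s: return only the top-level objects, as a list.
sub get_Features {
    my ($self) = @_;
    return @{ $self->{features} };
}

# get_all_{ClassName}s: top-level objects plus everything nested, flattened.
sub get_all_Features {
    my ($self) = @_;
    return map { ($_, $_->get_all_Features) } @{ $self->{features} };
}

# next_{ClassName}: iterator returning one object per call, undef when done.
sub next_Feature {
    my ($self) = @_;
    if ($self->{cursor} >= @{ $self->{features} }) {
        $self->{cursor} = 0;    # rewind so the iteration can be repeated
        return;
    }
    return $self->{features}[ $self->{cursor}++ ];
}

package main;

# Build a small hierarchy: $top holds $gene, which holds $exon.
my $exon = FeatureHolder->new;
my $gene = FeatureHolder->new;
$gene->add_Feature($exon);
my $top = FeatureHolder->new;
$top->add_Feature($gene);

my @direct = $top->get_Features;        # top-level objects only
my @flat   = $top->get_all_Features;    # nested objects included

my $iterated = 0;
while (my $feat = $top->next_Feature) {
    $iterated++;
}

print scalar(@direct), " top-level, ", scalar(@flat), " flattened, ",
      "$iterated seen by the iterator\n";
```

The Title Case part of each method name signals that objects come back; an accessor returning plain strings would stay lower case, as with get_all_tags.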
From sac at bioperl.org Wed Jun 13 17:17:27 2007 From: sac at bioperl.org (Steve Chervitz) Date: Wed, 13 Jun 2007 14:17:27 -0700 Subject: [Bioperl-l] method naming In-Reply-To: <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> Message-ID: <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com> On 6/13/07, Hilmar Lapp wrote: > We set a convention a while back on how to name these. It is > implemented in the bioperl.lisp file (too bad no one is using emacs > any more these days - it's a great editor), As a staunch xemacs-ophile I couldn't let that one slip by. Maybe we could improve the visibility of bioperl.lisp. In truth, I had forgotten about it, though lit turns out I was loading an old version of it. (Btw, using the latest version of bioperl.lisp with xemacs 21.4.17, I don't get a bioperl menu item, though I can access bioperl functions via M-x. Suggestions?) I see bioperl.lisp is mentioned twice parenthetically in the advanced bioperl wiki page. Perhaps a separate 'Editor/IDE support' item here would help. While we're at it, maybe we could add a bioperl.vi file to the distribution (if you can do such things with vi/vim). On 6/13/07, Chris Fields wrote: > We probably need to post this somewhere on the wiki for future > reference; maybe in Advanced BioPerl? I'll add this in shortly. Another idea: Add a method naming check to the set of audits we perform on CVS committed code. It could check for agreement with our conventions and warn if nothing was found (may not be a problem though). 
Steve From arareko at campus.iztacala.unam.mx Wed Jun 13 18:03:34 2007 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Wed, 13 Jun 2007 17:03:34 -0500 Subject: [Bioperl-l] method naming In-Reply-To: <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com> References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com> Message-ID: <467069B6.7080003@campus.iztacala.unam.mx> By the time of the 1.5.2 release, I jumped onto the idea of creating a BioPerl template for Komodo. Chris F handed me one he had already made but in the end I didn't had enough spare time to get into it. If someone wants to give it a try please let ChrisF/me know. Regards, Mauricio. Steve Chervitz wrote: > On 6/13/07, Hilmar Lapp wrote: >> We set a convention a while back on how to name these. It is >> implemented in the bioperl.lisp file (too bad no one is using emacs >> any more these days - it's a great editor), > > As a staunch xemacs-ophile I couldn't let that one slip by. Maybe we > could improve the visibility of bioperl.lisp. In truth, I had > forgotten about it, though lit turns out I was loading an old version > of it. (Btw, using the latest version of bioperl.lisp with xemacs > 21.4.17, I don't get a bioperl menu item, though I can access bioperl > functions via M-x. Suggestions?) > > I see bioperl.lisp is mentioned twice parenthetically in the advanced > bioperl wiki page. Perhaps a separate 'Editor/IDE support' item here > would help. While we're at it, maybe we could add a bioperl.vi file to > the distribution (if you can do such things with vi/vim). > > On 6/13/07, Chris Fields wrote: >> We probably need to post this somewhere on the wiki for future >> reference; maybe in Advanced BioPerl? I'll add this in shortly. > > Another idea: Add a method naming check to the set of audits we > perform on CVS committed code. 
It could check for agreement with our
> conventions and warn if nothing was found (may not be a problem
> though).
>
> Steve
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

--
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM

From hlapp at gmx.net Wed Jun 13 18:41:45 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 13 Jun 2007 18:41:45 -0400
Subject: [Bioperl-l] method naming
In-Reply-To: <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
Message-ID:

On Jun 13, 2007, at 5:17 PM, Steve Chervitz wrote:

> using the latest version of bioperl.lisp with xemacs 21.4.17, I
> don't get a bioperl menu item

I'm using GNU Emacs 22.0.50.1 (as Aquamacs) and the BioPerl menu item
is showing up just beautifully. (BTW it also has very nice icons for
various functions - though I always feel guilty for using keystrokes
instead.)

Is GNU Emacs finally winning this? ;)

-hilmar
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From jason at bioperl.org Wed Jun 13 18:58:51 2007
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 13 Jun 2007 15:58:51 -0700
Subject: [Bioperl-l] method naming
In-Reply-To:
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
Message-ID: <4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org>

Post your dueling screenshots to the wiki!
I had started a couple of IDE pages on the wiki a while ago: http://bioperl.org/wiki/Emacs http://bioperl.org/wiki/Emacs_template http://bioperl.org/wiki/Vi If anyone is feeling excited enough to write a few more IDE pages and link them into a common article that would be great. -jason On Jun 13, 2007, at 3:41 PM, Hilmar Lapp wrote: > > On Jun 13, 2007, at 5:17 PM, Steve Chervitz wrote: > >> using the latest version of bioperl.lisp with xemacs 21.4.17, I >> don't get a bioperl menu item > > I'm using GNU Emacs 22.0.50.1 (as Aquamacs) and the BioPerl menu item > it showing up just beautifully. (BTW it also have very nice icons for > various functions - though I always feel guilty for using keystrokes > instead.) > > Is GNU Emacs finally winning this? ;) > > -hilmar > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From cjfields at uiuc.edu Wed Jun 13 19:08:17 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 13 Jun 2007 18:08:17 -0500 Subject: [Bioperl-l] method naming In-Reply-To: <4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org> References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com> <4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org> Message-ID: Would probably be worth writing one up for Komodo since Mauricio, Sendu, and I use it. I updated the Advanced BioPerl page with Hilmar's methods suggestions/ rules (as well as a few I found dating back a number of years on the mail list). 
It might be worth a glance in case there are any changes needed: http://www.bioperl.org/wiki/Advanced_BioPerl chris On Jun 13, 2007, at 5:58 PM, Jason Stajich wrote: > Post your dualing screenshots to the wiki! > > I had started a couple of IDE pages on the wiki a while ago: > http://bioperl.org/wiki/Emacs > http://bioperl.org/wiki/Emacs_template > http://bioperl.org/wiki/Vi > > If anyone is feeling excited enough to write a few more IDE pages > and link them into a common article that would be great. > > -jason > On Jun 13, 2007, at 3:41 PM, Hilmar Lapp wrote: > >> >> On Jun 13, 2007, at 5:17 PM, Steve Chervitz wrote: >> >>> using the latest version of bioperl.lisp with xemacs 21.4.17, I >>> don't get a bioperl menu item >> >> I'm using GNU Emacs 22.0.50.1 (as Aquamacs) and the BioPerl menu item >> it showing up just beautifully. (BTW it also have very nice icons for >> various functions - though I always feel guilty for using keystrokes >> instead.) >> >> Is GNU Emacs finally winning this? ;) >> >> -hilmar >> -- >> =========================================================== >> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >> =========================================================== >> >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > Jason Stajich > jason at bioperl.org > http://jason.open-bio.org/ > > Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Wed Jun 13 19:28:17 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 13 Jun 2007 19:28:17 -0400 Subject: [Bioperl-l] method naming In-Reply-To: References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com> <4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org> Message-ID: <06AE29E7-6FFA-4F92-8BC8-39D9E48549E4@gmx.net> Thanks Chris for doing this - looks great. The only comment that I have is that method names should never start with a capital letter. If the getter/setter is for a single object (as opposed to a list), the name should probably be similar (if not identical) to the class being expected and returned, but lower-case. E.g., $feature->location(), $seq->species() etc -hilmar On Jun 13, 2007, at 7:08 PM, Chris Fields wrote: > Would probably be worth writing one up for Komodo since Mauricio, > Sendu, and I use it. > > I updated the Advanced BioPerl page with Hilmar's methods > suggestions/rules (as well as a few I found dating back a number of > years on the mail list). It might be worth a glance in case there > are any changes needed: > > http://www.bioperl.org/wiki/Advanced_BioPerl > > chris > > On Jun 13, 2007, at 5:58 PM, Jason Stajich wrote: > >> Post your dualing screenshots to the wiki! >> >> I had started a couple of IDE pages on the wiki a while ago: >> http://bioperl.org/wiki/Emacs >> http://bioperl.org/wiki/Emacs_template >> http://bioperl.org/wiki/Vi >> >> If anyone is feeling excited enough to write a few more IDE pages >> and link them into a common article that would be great. 
>> >> -jason >> On Jun 13, 2007, at 3:41 PM, Hilmar Lapp wrote: >> >>> >>> On Jun 13, 2007, at 5:17 PM, Steve Chervitz wrote: >>> >>>> using the latest version of bioperl.lisp with xemacs 21.4.17, I >>>> don't get a bioperl menu item >>> >>> I'm using GNU Emacs 22.0.50.1 (as Aquamacs) and the BioPerl menu >>> item >>> it showing up just beautifully. (BTW it also have very nice icons >>> for >>> various functions - though I always feel guilty for using keystrokes >>> instead.) >>> >>> Is GNU Emacs finally winning this? ;) >>> >>> -hilmar >>> -- >>> =========================================================== >>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >>> =========================================================== >>> >>> >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> -- >> Jason Stajich >> jason at bioperl.org >> http://jason.open-bio.org/ >> >> > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Wed Jun 13 19:44:08 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 13 Jun 2007 18:44:08 -0500 Subject: [Bioperl-l] method naming In-Reply-To: <06AE29E7-6FFA-4F92-8BC8-39D9E48549E4@gmx.net> References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com> <4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org> <06AE29E7-6FFA-4F92-8BC8-39D9E48549E4@gmx.net> Message-ID: <91AF2018-EC27-49FD-A4D1-C31C0E73DEFB@uiuc.edu> Agreed. We can definitely add that in. 
As we edge towards another release we try another round of cleaning up. I wouldn't mind pushing out another 1.5 point release before summer's up if possible; most of the tough work was done for v.1.5.2 by Sendu. chris On Jun 13, 2007, at 6:28 PM, Hilmar Lapp wrote: > Thanks Chris for doing this - looks great. The only comment that I > have is that method names should never start with a capital letter. > If the getter/setter is for a single object (as opposed to a list), > the name should probably be similar (if not identical) to the class > being expected and returned, but lower-case. > > E.g., $feature->location(), $seq->species() etc > > -hilmar > > On Jun 13, 2007, at 7:08 PM, Chris Fields wrote: > >> Would probably be worth writing one up for Komodo since Mauricio, >> Sendu, and I use it. >> >> I updated the Advanced BioPerl page with Hilmar's methods >> suggestions/rules (as well as a few I found dating back a number of >> years on the mail list). It might be worth a glance in case there >> are any changes needed: >> >> http://www.bioperl.org/wiki/Advanced_BioPerl >> >> chris ... From johncumbers at gmail.com Wed Jun 13 20:20:42 2007 From: johncumbers at gmail.com (John Cumbers) Date: Wed, 13 Jun 2007 20:20:42 -0400 Subject: [Bioperl-l] How can I pull out all instances of a motif from a genome sequence and output them as a BED file? Message-ID: Hello, I have a simple problem, I'm trying to search a genome sequence for a motif, I then want to output a BED file to display all the locations of this motif on the UCSC Genome Browser. I could not find a script to do this, so I started to write my own. I'm new to perl and my code below was my attempt to read the sequence string and output the index bp of the start of each motif. With this I could build the BED file myself, which requires start and finish base pairs. For the first motif I can output the start index, but when I try and read the next one off the sequence it does not work. 
Instead I just get an output of a list of 1's. I realise that this is more a request for some simple perl help, but any help much appreciated. Best wishes, John $seq_object = read_sequence("Drosophila.Chr3.test.AE014296.fasta"); #turn my FASTA file into a seq object. $sequence_as_a_string = $seq_object->seq(); #turn it into a string # search $sequence_as_a_string string for motif AAA as example # if found, return the index that it is found at while ($sequence_as_a_string =~ m/AAA/g) { print "Found '$&'. Next attempt at character " . pos($sequence_as_a_string)+1 . "\n"; } -- John Cumbers, Graduate Student Biology and Medicine Brown University, Box G-W Providence, Rhode Island, 02912, USA Tel USA: +1 401 523 8190, Fax: +1 401 863-2166 UK to USA: 0207 617 7824 From cjfields at uiuc.edu Wed Jun 13 21:58:37 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 13 Jun 2007 20:58:37 -0500 Subject: [Bioperl-l] How can I pull out all instances of a motif from a genome sequence and output them as a BED file? In-Reply-To: References: Message-ID: <5F3F49B4-CE07-4DD7-B82E-9DDE42B516D0@uiuc.edu> This is answered in the FAQ (sorry if the URL wraps, but we don't like tinyurls): http://www.bioperl.org/wiki/ FAQ#How_do_I_do_motif_searches_with_BioPerl.3F_Can_I_do_. 22find_all_sequences_that_are_75.25_identical.22_to_a_given_motif.3F chris On Jun 13, 2007, at 7:20 PM, John Cumbers wrote: > Hello, > > I have a simple problem, I'm trying to search a genome sequence for > a motif, > I then want to output a BED file to display all the locations of > this motif > on the UCSC Genome Browser. I could not find a script to do this, > so I > started to write my own. I'm new to perl and my code below was my > attempt > to read the sequence string and output the index bp of the start of > each > motif. With this I could build the BED file myself, which requires > start > and finish base pairs. 
> > For the first motif I can output the start index, but when I try > and read > the next one off the sequence it does not work. Instead I just get an > output of a list of 1's. I realise that this is more a request for > some > simple perl help, but any help much appreciated. > > Best wishes, > John > > > $seq_object = read_sequence > ("Drosophila.Chr3.test.AE014296.fasta"); #turn > my FASTA file into a seq object. > $sequence_as_a_string = $seq_object->seq(); #turn it into a string > # search $sequence_as_a_string string for motif AAA as example > # if found, return the index that it is found at > > while ($sequence_as_a_string =~ m/AAA/g) { > print "Found '$&'. Next attempt at character " . > pos($sequence_as_a_string)+1 . "\n"; > } > > > > -- > John Cumbers, Graduate Student > Biology and Medicine > Brown University, Box G-W > Providence, Rhode Island, 02912, USA > Tel USA: +1 401 523 8190, Fax: +1 401 863-2166 > UK to USA: 0207 617 7824 > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From jason at bioperl.org Thu Jun 14 00:08:04 2007 From: jason at bioperl.org (Jason Stajich) Date: Wed, 13 Jun 2007 21:08:04 -0700 Subject: [Bioperl-l] wiki bulk update Message-ID: <992B2C7A-E944-4C69-BDE0-B0B0F6D1274D@bioperl.org> I did a some bulk update of Module pages for new modules that had been created since we last setup these pages: I outlined a little bit of what it requires behind the scenes. 
http://bioperl.org/wiki/BioPerl:Module_pages -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From bix at sendu.me.uk Thu Jun 14 05:35:00 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 14 Jun 2007 10:35:00 +0100 Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new() Message-ID: <46710BC4.3060302@sendu.me.uk> It is preferable to have ->new syntax over new Object syntax, as outlined here: http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules I propose making this syntax change in all Bioperl POD documentation, so that the bad syntax is no longer suggested/encouraged. Any objections? If not, I'll go ahead and commit the changes. (affects 907 modules in live) Cheers, Sendu. From bix at sendu.me.uk Thu Jun 14 06:01:02 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 14 Jun 2007 11:01:02 +0100 Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new() In-Reply-To: <46710BC4.3060302@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> Message-ID: <467111DE.6060800@sendu.me.uk> Sendu Bala wrote: > It is preferable to have ->new syntax over new Object syntax, as > outlined here: > http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules > > > I propose making this syntax change in all Bioperl POD documentation, Actually, I propose making the change to code as well. From hlapp at gmx.net Thu Jun 14 08:47:47 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 14 Jun 2007 08:47:47 -0400 Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new() In-Reply-To: <467111DE.6060800@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <467111DE.6060800@sendu.me.uk> Message-ID: <0D7CD74F-DCB3-44F8-9AC7-144B1BD58946@gmx.net> Sounds fine to me. People do go by working examples, and I've seen inconsistent examples leading to confusion on the end of newbies. 
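For readers following the thread, the two constructor styles under discussion can be sketched as below. This is only an illustration under stated assumptions: My::Seq is a hypothetical stand-in class defined inline so the snippet runs without BioPerl installed; it is not a real BioPerl module, and the -id argument is made up for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-in class for illustration only (not part of BioPerl).
package My::Seq;

sub new {
    my ($class, %args) = @_;
    my $self = { id => $args{-id} };
    return bless $self, $class;
}

# Simple read-only accessor.
sub id { return $_[0]->{id} }

package main;

# Indirect object syntax ("new Object"): Perl has to guess that "new" is a
# method and "My::Seq" is a class, which can go wrong if a sub named "new"
# is in scope or the class is not yet known to the parser.
my $old_style = new My::Seq(-id => 'seq1');

# Arrow syntax ("Object->new"): an explicit class method call, nothing to guess.
my $new_style = My::Seq->new(-id => 'seq1');

print ref($old_style), " ", $old_style->id, "\n";
print ref($new_style), " ", $new_style->id, "\n";
```

Both calls build the same kind of object here; the arrow form just removes the parsing ambiguity, which is, as far as I understand it, the usual argument for preferring it.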
-hilmar On Jun 14, 2007, at 6:01 AM, Sendu Bala wrote: > Sendu Bala wrote: >> It is preferable to have ->new syntax over new Object syntax, as >> outlined here: >> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object- >> oriented_programming_and_modules >> >> >> I propose making this syntax change in all Bioperl POD documentation, > > Actually, I propose making the change to code as well. > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Thu Jun 14 08:55:18 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 14 Jun 2007 07:55:18 -0500 Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new() In-Reply-To: <467111DE.6060800@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <467111DE.6060800@sendu.me.uk> Message-ID: Sounds fine by me. I may actually start tackling some of the feature/ annotation overloading stuff myself to see what happens (I'll drop a notice when that occurs). chris On Jun 14, 2007, at 5:01 AM, Sendu Bala wrote: > Sendu Bala wrote: >> It is preferable to have ->new syntax over new Object syntax, as >> outlined here: >> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object- >> oriented_programming_and_modules >> >> >> I propose making this syntax change in all Bioperl POD documentation, > > Actually, I propose making the change to code as well. From tanzeem.mb at gmail.com Thu Jun 14 02:27:19 2007 From: tanzeem.mb at gmail.com (tanzeem) Date: Wed, 13 Jun 2007 23:27:19 -0700 (PDT) Subject: [Bioperl-l] Problem working with remoteblast submit method in webbrowser. 
Message-ID: <11114623.post@talk.nabble.com>

I have a program which uses the BioPerl RemoteBlast module to compare an
amino acid FASTA file with the Swiss-Prot database. The submit_blast()
method works successfully when run from the command line. But when the
program is run from a web browser it returns -1. I was trying to adapt the
code from the RemoteBlast synopsis for my needs.
--
View this message in context: http://www.nabble.com/Problem-working-with-remoteblast-submit-method-in-webbrowser.-tf3919886.html#a11114623
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.

From bix at sendu.me.uk Thu Jun 14 11:34:27 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 14 Jun 2007 16:34:27 +0100
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
In-Reply-To: <46710BC4.3060302@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk>
Message-ID: <46716003.2030302@sendu.me.uk>

Sendu Bala wrote:
> It is preferable to have ->new syntax over new Object syntax, as
> outlined here:
> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules
>
> I propose making this syntax change in all Bioperl POD documentation, so
> that the bad syntax is no longer suggested/encouraged. Any objections?
> If not, I'll go ahead and commit the changes.
>
> (affects 907 modules in live)

It was actually 515 modules & test scripts from live, 48 from run, 21
from db and 2 from network.

Now committed. Before and after my changes these were failing:

Failed Test      Stat Wstat Total Fail  List of Failed
-------------------------------------------------------------------------------
t/BioGraphics.t     3   768    38    3  3-5
t/PodSyntax.t       9  2304  2195    9  378 614 660 1023 1197 1512 1558 1932 2106
t/Sopma.t           2   512    16    2  8 15
t/genbank.t         2   512   247    2  122-123

BioGraphics failure caused by Graphics Panel.pm 1.135 -> 1.136
(unintentional?).

Sopma may not be a bug: results from server might have changed.

genbank caused by genbank.t 1.20 -> 1.21 and presumably genbank.pm 1.163
-> 1.164 not doing what the new tests expect.

PodSyntax is new and obviously a bunch of POD needs fixing: Nathan, are
you working on that, or can I fix those errors?

Anyone care to look into those things?

Cheers,
Sendu.

From cjfields at uiuc.edu Thu Jun 14 12:35:21 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 14 Jun 2007 11:35:21 -0500
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
In-Reply-To: <46716003.2030302@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
Message-ID:

The genbank commit was mine so I'll look into it; it may be that I hadn't
finished up the bug work. If I have time I'll look into Sopma as well
(unless you get to it first).

chris

On Jun 14, 2007, at 10:34 AM, Sendu Bala wrote:

> Sendu Bala wrote:
>> It is preferable to have ->new syntax over new Object syntax, as
>> outlined here:
>> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules
>>
>> I propose making this syntax change in all Bioperl POD
>> documentation, so that the bad syntax is no longer
>> suggested/encouraged. Any objections?
>> If not, I'll go ahead and commit the changes.
>>
>> (affects 907 modules in live)
>
> It was actually 515 modules & test scripts from live, 48 from run, 21
> from db and 2 from network.
>
> Now committed. Before and after my changes these were failing:
>
> Failed Test      Stat Wstat Total Fail  List of Failed
> -------------------------------------------------------------------------------
> t/BioGraphics.t     3   768    38    3  3-5
> t/PodSyntax.t       9  2304  2195    9  378 614 660 1023 1197 1512 1558 1932 2106
> t/Sopma.t           2   512    16    2  8 15
> t/genbank.t         2   512   247    2  122-123
>
> BioGraphics failure caused by Graphics Panel.pm 1.135 -> 1.136
> (unintentional?).
>
> Sopma may not be a bug: results from server might have changed.
> > genbank caused by genbank.t 1.20 -> 1.21 and presumably genbank.pm > 1.163 > -> 1.164 not doing what the new tests expect. > > PodSyntax is new and obviously a bunch of POD needs fixing: Nathan, > are > you working on that, or can I fix those errors? > > Anyone care to look into those things? > > Cheers, > Sendu. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Thu Jun 14 12:43:43 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Thu, 14 Jun 2007 17:43:43 +0100 Subject: [Bioperl-l] Perltidy In-Reply-To: <46716003.2030302@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> Message-ID: <4671703F.4010109@sheffield.ac.uk> I'm just wondering if anyone passes their modules through perltidy in order for them to have the same look/feel? If so, do you have a .perltidyrc file? Also, is it worth running the Bioperl modules through it? Nath From n.haigh at sheffield.ac.uk Thu Jun 14 12:36:37 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Thu, 14 Jun 2007 17:36:37 +0100 Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new() In-Reply-To: <46716003.2030302@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> Message-ID: <46716E95.3090604@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Sendu Bala wrote: > Sendu Bala wrote: >> It is preferable to have ->new syntax over new Object syntax, as >> outlined here: >> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules >> >> I propose making this syntax change in all Bioperl POD documentation, so >> that the bad syntax is no longer suggested/encouraged. Any objections? 
>> If not, I'll go ahead and commit the changes. >> >> (affects 907 modules in live) > > It was actually 515 modules & test scripts from live, 48 from run, 21 > from db and 2 from network. > > Now committed. Before and after my changes these were failing: > > > Failed Test Stat Wstat Total Fail List of Failed > ------------------------------------------------------------------------------- > t/BioGraphics.t 3 768 38 3 3-5 > t/PodSyntax.t 9 2304 2195 9 378 614 660 1023 1197 1512 1558 > 1932 2106 > t/Sopma.t 2 512 16 2 8 15 > t/genbank.t 2 512 247 2 122-123 > > > BioGraphics failure caused by Graphics Panel.pm 1.135 -> 1.136 > (unintentional?). > > Sopma may not be a bug: results from server might have changed. > > genbank caused by genbank.t 1.20 -> 1.21 and presumably genbank.pm 1.163 > -> 1.164 not doing what the new tests expect. > > PodSyntax is new and obviously a bunch of POD needs fixing: Nathan, are > you working on that, or can I fix those errors? > I can fix these - although I'm still trying to get my new Debian 4.0 system up-to-speed so it might take me a little while! RE the PodSyntax.t, I've got it silently skipping the tests if Test::Pod isn't installed. However, would it be better to have Test::Pod in t/lib so that it runs on the user's system during installation or leave it as is? Nath -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGcW6VczuW2jkwy2gRAv3dAKCURgd4F881MhbessKxNh/cPrJu2wCeLwnS 7olroF2e6+4I0biz6fWRmu4= =s3hK -----END PGP SIGNATURE----- From bix at sendu.me.uk Thu Jun 14 13:15:24 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 14 Jun 2007 18:15:24 +0100 Subject: [Bioperl-l] Perltidy In-Reply-To: <4671703F.4010109@sheffield.ac.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> Message-ID: <467177AC.8060104@sendu.me.uk> Nathan S. 
Haigh wrote: > I'm just wondering if anyone passes their modules through perltidy in > order for them to have the same look/feel? If so, do you have a > .perltidyrc file? Also, is it worth running the Bioperl modules through it? I don't use it, but I was contemplating the same thing. Chris uses it from time to time and I think we have a similar taste in style. But we'd have to hammer something out that was agreeable to everyone. From mmokrejs at ribosome.natur.cuni.cz Thu Jun 14 13:19:42 2007 From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=) Date: Thu, 14 Jun 2007 19:19:42 +0200 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? In-Reply-To: <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> References: <466938F6.7050903@ribosome.natur.cuni.cz> <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> Message-ID: <467178AE.5040905@ribosome.natur.cuni.cz> David Messina wrote: > Hi Martin, > > You're in luck -- the BioPerl core distribution includes two scripts > for doing just that: > > genbank2gff Somehow these scripts were not installed for me on Gentoo, but I have then in the cvs copy. ;-) Anyway, the one above is not for me, I do not need the GFF database, or better to say I have no intent to install that unknown thing, seems like an overkill for my case. I just want to render a plasmid map. > genbank2gff3 This one seems more promising but still with current cvs checkout I get... $ perl /home/mmokrejs/proj/bioperl/bioperl-live/blib/script/bp_genbank2gff3.pl --in stdin --out stdout < ~/99.gb # Input: stdin Use of uninitialized value in split at /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/genbank.pm line 1335, line 7. Use of uninitialized value in quotemeta at /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/genbank.pm line 1338, line 7. Can't call method "binomial" on an undefined value at /home/mmokrejs/proj/bioperl/bioperl-live/blib/script/bp_genbank2gff3.pl line 675, line 125. 
$ $ bp_seqconvert.pl --from genbank --to embl < ~/IRESite/gb/99.gb Use of uninitialized value in split at /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/genbank.pm line 1335, line 7. Use of uninitialized value in quotemeta at /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/genbank.pm line 1338, line 7. ID unknown; SV 1; circular; unassigned DNA; STD; UNC; 5391 BP. XX AC unknown; XX XX XX CC ApEinfo:methylated:0 ... Oh dear, I have just manually edited the files and still they are wrong? Oh no. :( > > Look in the scripts directory of the distro. > > Also, there is a *huge* amount of documentation and examples on the > BioPerl website. > > http://www.bioperl.org/wiki/HOWTOs You mean http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File ? ;-) > > Reading those, reading the FAQ, and searching the mailing list > archives are where I look first when I don't know how to do something > in BioPerl. > > > Dave > > -- > Dave Messina > Senior Analyst, Assembly Group > Genome Sequencing Center > Washington University > St. Louis, MO > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- Dr. Martin Mokrejs Dept. of Genetics and Microbiology Faculty of Science, Charles University Vinicna 5, 128 43 Prague, Czech Republic http://www.iresite.org http://www.iresite.org/~mmokrejs -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: 99.gb URL: From mmokrejs at ribosome.natur.cuni.cz Thu Jun 14 13:23:28 2007 From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=) Date: Thu, 14 Jun 2007 19:23:28 +0200 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? 
In-Reply-To: <467178AE.5040905@ribosome.natur.cuni.cz>
References: <466938F6.7050903@ribosome.natur.cuni.cz> <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> <467178AE.5040905@ribosome.natur.cuni.cz>
Message-ID: <46717990.6040509@ribosome.natur.cuni.cz>

Martin MOKREJŠ wrote:

>> Also, there is a *huge* amount of documentation and examples on the
>> BioPerl website.
>>
>> http://www.bioperl.org/wiki/HOWTOs
>
> You mean
> http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File
> ? ;-)

$ perl embl2picture.pl ~/99.gb | display -
Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, line 125.

Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa57f0), feature Bio::Location::Simple=HASH(0x893ebb8): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, line 125.

Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5880), feature Bio::Location::Simple=HASH(0x893ebac): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, line 125.

Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5430), feature Bio::Location::Simple=HASH(0x893e720): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, line 125.

Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa546c), feature Bio::Location::Simple=HASH(0x893e6b4): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, line 125.
$ The plasmid is a circular DNA, why is the diagram in linear? ;-) Martin From bix at sendu.me.uk Thu Jun 14 13:03:34 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 14 Jun 2007 18:03:34 +0100 Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new() In-Reply-To: <46716E95.3090604@sheffield.ac.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <46716E95.3090604@sheffield.ac.uk> Message-ID: <467174E6.1090001@sendu.me.uk> Nathan S. Haigh wrote: >> PodSyntax is new and obviously a bunch of POD needs fixing: Nathan, are >> you working on that, or can I fix those errors? > > I can fix these - although I'm still trying to get my new Debian 4.0 > system up-to-speed so it might take me a little while! RE the > PodSyntax.t, I've got it silently skipping the tests if Test::Pod isn't > installed. However, would it be better to have Test::Pod in t/lib so > that it runs on the user's system during installation or leave it as is? Leave it as is. Every-day users don't need to check the syntax of the pod. In fact, it really only needs to be done once, prior to packaging up a new release. From n.haigh at sheffield.ac.uk Thu Jun 14 13:32:37 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Thu, 14 Jun 2007 18:32:37 +0100 Subject: [Bioperl-l] Perltidy In-Reply-To: <467177AC.8060104@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> Message-ID: <46717BB5.8000706@sheffield.ac.uk> Sendu Bala wrote: > Nathan S. Haigh wrote: >> I'm just wondering if anyone passes their modules through perltidy in >> order for them to have the same look/feel? If so, do you have a >> .perltidyrc file? Also, is it worth running the Bioperl modules >> through it? > > I don't use it, but I was contemplating the same thing. Chris uses it > from time to time and I think we have a similar taste in style. 
>
> But we'd have to hammer something out that was agreeable to everyone.

A starting place may be Perl Best Practices by Damian Conway:
http://www.oreilly.com/catalog/perlbp/

The perltidyrc file can be found here:
http://www.perlmonks.org/?node_id=485885

I also found this nice thread with some ideas, including some code that
causes emacs to auto-perltidy everything you use cperl-mode with. I don't
use emacs myself, but here's the link if anyone is interested:
http://www.perlmonks.org/?node_id=516501

Nath

From johnsonm at gmail.com Thu Jun 14 13:38:31 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Thu, 14 Jun 2007 12:38:31 -0500
Subject: [Bioperl-l] Perltidy
In-Reply-To: <467177AC.8060104@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk>
Message-ID:

The nice thing about Perl Tidy is that everybody can have their own
config file. There could be a bioperl default config that gets applied at
checkin time. Anybody that didn't like it could script checkouts to get
run through their own config. Diffs might get a little hairy, but as long
as you tidy before diffing, it shouldn't be too bad. Speaking of
which....coding style is controversial enough, but since that's already
been opened, what about CVS vs Subversion? 8) Some of the scripting for
this sort of thing might be easier in Subversion. Though maybe something
like Git would fit the developer model better (more support for
distributed development).

On 6/14/07, Sendu Bala wrote:
> Nathan S. Haigh wrote:
> > I'm just wondering if anyone passes their modules through perltidy in
> > order for them to have the same look/feel? If so, do you have a
> > .perltidyrc file? Also, is it worth running the Bioperl modules through it?
>
> I don't use it, but I was contemplating the same thing. Chris uses it
> from time to time and I think we have a similar taste in style.
> > But we'd have to hammer something out that was agreeable to everyone. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From n.haigh at sheffield.ac.uk Thu Jun 14 13:39:39 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Thu, 14 Jun 2007 18:39:39 +0100 Subject: [Bioperl-l] cvs changes in working copy Message-ID: <46717D5B.5040108@sheffield.ac.uk> Not sure if I'm being dense or if it's because I've been working with svn recently, but - how do I get a list of files that are different in my working copy compared to the repository? Cheers Nath From cjfields at uiuc.edu Thu Jun 14 13:46:38 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 14 Jun 2007 12:46:38 -0500 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? In-Reply-To: <46717990.6040509@ribosome.natur.cuni.cz> References: <466938F6.7050903@ribosome.natur.cuni.cz> <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> <467178AE.5040905@ribosome.natur.cuni.cz> <46717990.6040509@ribosome.natur.cuni.cz> Message-ID: Is 99.gb supposed to be a GenBank file? And you're loading it into embl2picture (which I assume takes EMBL format files)? Without example code we can easily make the wrong assumptions (i.e. that this is user error and not a BioPerl problem). Also, I don't believe the feature plotting scripts plot circular chromosomes/plasmids. If you want this functionality you'll have to code it for yourself. chris On Jun 14, 2007, at 12:23 PM, Martin MOKREJ? wrote: > Martin MOKREJ? wrote: > >>> Also, there is a *huge* amount of documentation and examples on the >>> BioPerl website. >>> >>> http://www.bioperl.org/wiki/HOWTOs >> >> You mean >> http://www.bioperl.org/wiki/ >> HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File >> ? 
;-) > > $ perl embl2picture.pl ~/99.gb | display - > Error returned while evaluating value of 'description' option for > glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature > Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method > "all_tags" via package "Bio::Location::Simple" at embl2picture.pl > line 141, line 125. > > Error returned while evaluating value of 'description' option for > glyph Bio::Graphics::Glyph::generic=HASH(0x8aa57f0), feature > Bio::Location::Simple=HASH(0x893ebb8): Can't locate object method > "all_tags" via package "Bio::Location::Simple" at embl2picture.pl > line 141, line 125. > > Error returned while evaluating value of 'description' option for > glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5880), feature > Bio::Location::Simple=HASH(0x893ebac): Can't locate object method > "all_tags" via package "Bio::Location::Simple" at embl2picture.pl > line 141, line 125. > > Error returned while evaluating value of 'description' option for > glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5430), feature > Bio::Location::Simple=HASH(0x893e720): Can't locate object method > "all_tags" via package "Bio::Location::Simple" at embl2picture.pl > line 141, line 125. > > Error returned while evaluating value of 'description' option for > glyph Bio::Graphics::Glyph::generic=HASH(0x8aa546c), feature > Bio::Location::Simple=HASH(0x893e6b4): Can't locate object method > "all_tags" via package "Bio::Location::Simple" at embl2picture.pl > line 141, line 125. > $ > > The plasmid is a circular DNA, why is the diagram in linear? ;-) > > Martin > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From arareko at campus.iztacala.unam.mx Thu Jun 14 13:57:35 2007 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Thu, 14 Jun 2007 12:57:35 -0500 Subject: [Bioperl-l] Perltidy In-Reply-To: <46717BB5.8000706@sheffield.ac.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <46717BB5.8000706@sheffield.ac.uk> Message-ID: <4671818F.5040902@campus.iztacala.unam.mx> I think a consensus .perltidyrc could be placed in the source distribution. Mauricio. Nathan S. Haigh wrote: > Sendu Bala wrote: >> Nathan S. Haigh wrote: >>> I'm just wondering if anyone passes their modules through perltidy in >>> order for them to have the same look/feel? If so, do you have a >>> .perltidyrc file? Also, is it worth running the Bioperl modules >>> through it? >> I don't use it, but I was contemplating the same thing. Chris uses it >> from time to time and I think we have a similar taste in style. >> >> But we'd have to hammer something out that was agreeable to everyone. > > A starting place maybe Perl Best Practices by Damian Conway: > http://www.oreilly.com/catalog/perlbp/ > > > The perltidyrc file can e found here: > http://www.perlmonks.org/?node_id=485885 > > I also found this nice thread with some ideas, inc some code that causes > emacs to auto-perltidy everything you use cperl-mode with. 
I don't use
> emacs myself, but here's the link if anyone is interested:
> http://www.perlmonks.org/?node_id=516501
>
> Nath
>

--
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM

From cjfields at uiuc.edu Thu Jun 14 14:32:41 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 14 Jun 2007 13:32:41 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To:
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk>
Message-ID:

To chip in on this, I only use perltidy when I need to clean bioperl
code up for debugging (particularly if blocks are hard to see) and just
use the defaults. I agree it would be nice to have everything tidied up
but it'll definitely need to be a consensus config file.

About svn, I like the idea of eventually migrating to using it over CVS
(I think BioPython and BioJava have plans to but I'm not sure) but I
don't really know enough to say how feasible/difficult the migration
path would be. Anyone know?

chris

On Jun 14, 2007, at 12:38 PM, Mark Johnson wrote:

> The nice thing about Perl Tidy is that everybody can have their
> own config file. There could be a bioperl default config that gets
> applied at checkin time. Anybody that didn't like it could script
> checkouts to get run through their own config. Diffs might get a
> little hairy, but as long as you tidy before diffing, it shouldn't be
> too bad. Speaking of which....coding style is controversial enough,
> but since that's already been opened, what about CVS vs Subversion? 8)
> Some of the scripting for this sort of thing might be easier in
> Subversion.
Though maybe something like Git would fit the developer
> model better (more support for distributed development).
>
> On 6/14/07, Sendu Bala wrote:
>> Nathan S. Haigh wrote:
>>> I'm just wondering if anyone passes their modules through
>>> perltidy in order for them to have the same look/feel? If so,
>>> do you have a .perltidyrc file? Also, is it worth running the
>>> Bioperl modules through it?
>>
>> I don't use it, but I was contemplating the same thing. Chris uses it
>> from time to time and I think we have a similar taste in style.
>>
>> But we'd have to hammer something out that was agreeable to everyone.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From johnsonm at gmail.com Thu Jun 14 14:46:24 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Thu, 14 Jun 2007 13:46:24 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To:
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk>
Message-ID:

If there was a default/standard/consensus bioperl perltidy config file,
I would probably use it prior to checkin, on my own, so I could code in
my schizophrenic style without worrying about starting any format wars.
When I'm fixing or enhancing somebody else's code, I always try and
adapt to whatever style they used, even if it grates on my nerves. I'd
love to not have to worry about that with Bioperl. Of course, nobody
will ever agree on a standard, so it's probably a moot point.
8)

On 6/14/07, Chris Fields wrote:
> To chip in on this, I only use perltidy when I need to clean bioperl
> code up for debugging (particularly if blocks are hard to see) and
> just use the defaults. I agree it would be nice to have everything
> tidied up but it'll definitely need to be a consensus config file.
>
> About svn, I like the idea of eventually migrating to using it over
> CVS (I think BioPython and BioJava have plans to but I'm not sure)
> but I don't really know enough to say how feasible/difficult the
> migration path would be. Anyone know?
>
> chris

From jason at bioperl.org Thu Jun 14 15:00:09 2007
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 14 Jun 2007 12:00:09 -0700
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To:
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk>
Message-ID:

On Jun 14, 2007, at 11:32 AM, Chris Fields wrote:

> To chip in on this, I only use perltidy when I need to clean bioperl
> code up for debugging (particularly if blocks are hard to see) and
> just use the defaults. I agree it would be nice to have everything
> tidied up but it'll definitely need to be a consensus config file.
>

Can we do any sort of massive conversion at some logical timepoint?
Probably after a branch release or something? Because it basically
means we're going to have differences on nearly every line, which is
going to make diff-ing difficult when debugging old/new versions.
Maybe it is not a problem because we aren't introducing any new bugs!

> About svn, I like the idea of eventually migrating to using it over
> CVS (I think BioPython and BioJava have plans to but I'm not sure)
> but I don't really know enough to say how feasible/difficult the
> migration path would be. Anyone know?
>

It's doable but non-trivial. A cvs2svn script (python, gah!) exists to
help with this. There are pros and cons to converting.
There is a fair amount of documentation and other pointers out there that point to the CVS server for getting latest code so we'd need to think about whether we'd support some sort of backwards compatible SVN -> CVS for read-only or what. Mostly it will need someone to lead the charge - I made a go at doing it in the winter, but I really don't have the SVN-foo to make this work. We'd need someone with SVN experience to step up and help. You can always try and we can play with the converted repository for a while without making it the new code base. -j > chris > > On Jun 14, 2007, at 12:38 PM, Mark Johnson wrote: > >> The nice thing about Perl Tidy is that everybody can have their >> own config file. There could be a bioperl default config that gets >> applied at checkin time. Anybody that didn't like it could script >> checkouts to get run through their own config. Diffs might get a >> little hairy, but as long as you tidy before diffing, it shouldn't be >> too bad. Speaking of which....coding style is controversial enough, >> but since that's already been opened, what about CVS vs >> Subversion? 8) >> Some of the scripting for this sort of thing might be easer in >> Subversion. Though maybe something like Git would fit the developer >> model better (more support for distributed development). >> >> On 6/14/07, Sendu Bala wrote: >>> Nathan S. Haigh wrote: >>>> I'm just wondering if anyone passes their modules through >>>> perltidy in >>>> order for them to have the same look/feel? If so, do you have a >>>> .perltidyrc file? Also, is it worth running the Bioperl modules >>>> through it? >>> >>> I don't use it, but I was contemplating the same thing. Chris >>> uses it >>> from time to time and I think we have a similar taste in style. >>> >>> But we'd have to hammer something out that was agreeable to >>> everyone. 
>>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From jason at bioperl.org Thu Jun 14 15:01:27 2007 From: jason at bioperl.org (Jason Stajich) Date: Thu, 14 Jun 2007 12:01:27 -0700 Subject: [Bioperl-l] cvs changes in working copy In-Reply-To: <46717D5B.5040108@sheffield.ac.uk> References: <46717D5B.5040108@sheffield.ac.uk> Message-ID: cvs update | grep '^M' On Jun 14, 2007, at 10:39 AM, Nathan S. Haigh wrote: > Not sure if I'm being dense or if it's because I've been working with > svn recently, but - how do I get a list of files that are different in > my working copy compared to the repository? 
> > Cheers > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From cjfields at uiuc.edu Thu Jun 14 15:20:46 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 14 Jun 2007 14:20:46 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> Message-ID: On Jun 14, 2007, at 2:00 PM, Jason Stajich wrote: > > On Jun 14, 2007, at 11:32 AM, Chris Fields wrote: > >> To chip in on this, I only use perltidy when I need to clean bioperl >> code up for debugging (particularly if blocks are hard to see) and >> just use the defaults. I agree it would be nice to have everything >> tidied up but it'll definitely need to be a consensus config file. >> > > Can we do any sort of massive conversion at some logical timepoint. > Probably after a branch release or something? Because it basically > means we're going to have differences on nearly every line which is > going to make diff-ing difficult when debugging old/new versions. > Maybe it is not a problem because we aren't introducing and new bugs! I agree; if we intend on doing this it should be all at once, maybe on a branch dedicated to ensure that code changes don't tank tests (they shouldn't but one never knows). We would then need a script up- and-running that tidies everything up prior to commits (though what happens if perltidy tanks?...). Sendu, up for it? >> About svn, I like the idea of eventually migrating to using it over >> CVS (I think BioPython and BioJava have plans to but I'm not sure) >> but I don't really know enough to say how feasible/difficult the >> migration path would be. Anyone know? >> > > It's doable but non-trivial. cvs2svn (python gah!) script exists to > help in this. 
There are pros and cons to converting. There is a > fair amount of documentation and other pointers out there that point > to the CVS server for getting latest code so we'd need to think about > whether we'd support some sort of backwards compatible SVN -> CVS for > read-only or what. > > Mostly it will need someone to lead the charge - I made a go at doing > it in the winter, but I really don't have the SVN-foo to make this > work. We'd need someone with SVN experience to step up and help. > You can always try and we can play with the converted repository for > a while without making it the new code base. > > -j Stepped into that one, didn't I! I'll look into how much effort is involved and try getting something going in the next month or two, maybe sooner if time permits. I'm lacking on SVN-foo as well but it might be worth looking into. chris From arareko at campus.iztacala.unam.mx Thu Jun 14 15:50:39 2007 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Thu, 14 Jun 2007 14:50:39 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> Message-ID: <46719C0F.5010706@campus.iztacala.unam.mx> Chris Fields wrote: > On Jun 14, 2007, at 2:00 PM, Jason Stajich wrote: > >> On Jun 14, 2007, at 11:32 AM, Chris Fields wrote: >> >>> About svn, I like the idea of eventually migrating to using it over >>> CVS (I think BioPython and BioJava have plans to but I'm not sure) >>> but I don't really know enough to say how feasible/difficult the >>> migration path would be. Anyone know? >>> >> It's doable but non-trivial. cvs2svn (python gah!) script exists to >> help in this. There are pros and cons to converting. 
There is a
>> fair amount of documentation and other pointers out there that point
>> to the CVS server for getting latest code so we'd need to think about
>> whether we'd support some sort of backwards compatible SVN -> CVS for
>> read-only or what.
>>
>> Mostly it will need someone to lead the charge - I made a go at doing
>> it in the winter, but I really don't have the SVN-foo to make this
>> work. We'd need someone with SVN experience to step up and help.
>> You can always try and we can play with the converted repository for
>> a while without making it the new code base.
>>
>> -j
>
> Stepped into that one, didn't I! I'll look into how much effort is
> involved and try getting something going in the next month or two,
> maybe sooner if time permits. I'm lacking on SVN-foo as well but it
> might be worth looking into.
>
> chris
>

Chris D has worked with CVS-SVN transitioning for other projects; maybe
he can shed some light on this.

Mauricio.

--
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM

From sac at bioperl.org Thu Jun 14 17:33:39 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Thu, 14 Jun 2007 14:33:39 -0700
Subject: [Bioperl-l] How can I pull out all instances of a motif from a genome sequence and output them as a BED file?
In-Reply-To: <5F3F49B4-CE07-4DD7-B82E-9DDE42B516D0@uiuc.edu>
References: <5F3F49B4-CE07-4DD7-B82E-9DDE42B516D0@uiuc.edu>
Message-ID: <8f200b4c0706141433i37267774u1dc2193d8508c47b@mail.gmail.com>

This issue was discussed recently here. Check out this thread:
http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15046/focus=15048

Some of the tools listed in the FAQ item Chris mentioned do not
report where the match occurred, only that a match occurred
(String::Approx, agrep), though some do report match locations
(fuzznuc, fuzzprot; not sure about TFBS).
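
If the motif can be written as a regular expression, plain Perl already reports where each match occurred, which is all a BED line needs. Below is a minimal sketch, not code from BioPerl: the sub name `motif_to_bed`, the chromosome label `chr3L`, the feature name and the toy sequence are all made up for illustration, and only the forward strand is searched. BED uses a 0-based start and an exclusive end, which is what Perl's `@-` and `@+` arrays hold after a successful match.

```perl
use strict;
use warnings;

# Return one BED line (chrom, 0-based start, exclusive end, name) for
# every forward-strand match of $motif in $seq. $chrom and $name are
# labels supplied by the caller (hypothetical values in the demo below).
sub motif_to_bed {
    my ($chrom, $seq, $motif, $name) = @_;
    my @bed;
    while ($seq =~ /$motif/gi) {
        # $-[0] and $+[0] are the start and end offsets of the last match
        my ($start, $end) = ($-[0], $+[0]);
        push @bed, join("\t", $chrom, $start, $end, $name);
        # resume one base after this match's start so that
        # overlapping occurrences are reported as well
        pos($seq) = $start + 1;
    }
    return @bed;
}

# Toy input; a real run would pass the string from $seq_object->seq()
print "$_\n" for motif_to_bed('chr3L', 'CCAAAATTAAAG', 'AAA', 'AAA_motif');
```

As for the list of 1's in the original attempt quoted further down: `.` and `+` have the same precedence in Perl and associate left to right, so the text is concatenated with `pos(...)` first and the resulting string is then numified by `+1`. Writing `(pos($sequence_as_a_string) + 1)` in parentheses should avoid that.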

My Bio::Tools::SeqPattern module does not even perform any matches; it
just encapsulates a regular expression for a nuc or protein motif and
knows how to handle ambiguity code expansion and reverse
complementing. The idea is that you can use this to convert a
biological sequence motif into a string suitable for use in a perl
regex. Adding a match() method to this module would be handy.

There is an example script for it in examples/tools of the distro
(which, btw, references an obsolete module, so it won't run as is --
I'll fix).

Steve

On 6/13/07, Chris Fields wrote:
> This is answered in the FAQ (sorry if the URL wraps, but we don't
> like tinyurls):
>
> http://www.bioperl.org/wiki/
> FAQ#How_do_I_do_motif_searches_with_BioPerl.3F_Can_I_do_.
> 22find_all_sequences_that_are_75.25_identical.22_to_a_given_motif.3F
>
> chris
>
> On Jun 13, 2007, at 7:20 PM, John Cumbers wrote:
>
> > Hello,
> >
> > I have a simple problem, I'm trying to search a genome sequence for
> > a motif,
> > I then want to output a BED file to display all the locations of
> > this motif
> > on the UCSC Genome Browser. I could not find a script to do this,
> > so I
> > started to write my own. I'm new to perl and my code below was my
> > attempt
> > to read the sequence string and output the index bp of the start of
> > each
> > motif. With this I could build the BED file myself, which requires
> > start
> > and finish base pairs.
> >
> > For the first motif I can output the start index, but when I try
> > and read
> > the next one off the sequence it does not work. Instead I just get an
> > output of a list of 1's. I realise that this is more a request for
> > some
> > simple perl help, but any help much appreciated.
> >
> > Best wishes,
> > John
> >
> >
> > $seq_object = read_sequence
> > ("Drosophila.Chr3.test.AE014296.fasta"); #turn
> > my FASTA file into a seq object.
> > $sequence_as_a_string = $seq_object->seq(); #turn it into a string > > # search $sequence_as_a_string string for motif AAA as example > > # if found, return the index that it is found at > > > > while ($sequence_as_a_string =~ m/AAA/g) { > > print "Found '$&'. Next attempt at character " . > > pos($sequence_as_a_string)+1 . "\n"; > > } > > > > > > > > -- > > John Cumbers, Graduate Student > > Biology and Medicine > > Brown University, Box G-W > > Providence, Rhode Island, 02912, USA > > Tel USA: +1 401 523 8190, Fax: +1 401 863-2166 > > UK to USA: 0207 617 7824 > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From hlapp at gmx.net Thu Jun 14 19:04:11 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 14 Jun 2007 19:04:11 -0400 Subject: [Bioperl-l] cvs changes in working copy In-Reply-To: References: <46717D5B.5040108@sheffield.ac.uk> Message-ID: <3B262E6A-2C90-49FA-BCA1-BF1900C5AC3A@gmx.net> Actually, that will update your repository. If you just wanted to take a peek you would use cvs status: $ cvs status | grep 'Locally Modified' -hilmar On Jun 14, 2007, at 3:01 PM, Jason Stajich wrote: > cvs update | grep '^M' > > On Jun 14, 2007, at 10:39 AM, Nathan S. Haigh wrote: > >> Not sure if I'm being dense or if it's because I've been working with >> svn recently, but - how do I get a list of files that are >> different in >> my working copy compared to the repository? 
>> >> Cheers >> Nath >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > Jason Stajich > jason at bioperl.org > http://jason.open-bio.org/ > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From mmokrejs at ribosome.natur.cuni.cz Fri Jun 15 03:28:17 2007 From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=) Date: Fri, 15 Jun 2007 09:28:17 +0200 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? In-Reply-To: References: <466938F6.7050903@ribosome.natur.cuni.cz> <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> <467178AE.5040905@ribosome.natur.cuni.cz> <46717990.6040509@ribosome.natur.cuni.cz> Message-ID: <46723F91.60501@ribosome.natur.cuni.cz> Chris Fields wrote: > Is 99.gb supposed to be a GenBank file? And you're loading it into Yes, it was attached to the email. ;) > embl2picture (which I assume takes EMBL format files)? Without example > code we can easily make the wrong assumptions (i.e. that this is user > error and not a BioPerl problem). use constant USAGE =>< Render a GenBank/EMBL entry into drawable form. Return as a GIF or PNG image on standard output. File must be in embl, genbank, or another SeqIO- recognized format. Only the first entry will be rendered. Example to try: embl2picture.pl factor7.embl | display - END > > Also, I don't believe the feature plotting scripts plot circular > chromosomes/plasmids. If you want this functionality you'll have to > code it for yourself. That's a pitty it does not, but at least if someone could improve the docs. 
;) Unfortunately I don't have the time to rewrite the code myself now, I need a working, standalone, already available tool. :( M. > > chris > > On Jun 14, 2007, at 12:23 PM, Martin MOKREJ? wrote: > >> Martin MOKREJ? wrote: >> >>>> Also, there is a *huge* amount of documentation and examples on the >>>> BioPerl website. >>>> >>>> http://www.bioperl.org/wiki/HOWTOs >>> >>> You mean >>> http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File >>> >>> ? ;-) >> >> $ perl embl2picture.pl ~/99.gb | display - >> Error returned while evaluating value of 'description' option for >> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature >> Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method >> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line >> 141, line 125. >> >> Error returned while evaluating value of 'description' option for >> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa57f0), feature >> Bio::Location::Simple=HASH(0x893ebb8): Can't locate object method >> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line >> 141, line 125. >> >> Error returned while evaluating value of 'description' option for >> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5880), feature >> Bio::Location::Simple=HASH(0x893ebac): Can't locate object method >> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line >> 141, line 125. >> >> Error returned while evaluating value of 'description' option for >> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5430), feature >> Bio::Location::Simple=HASH(0x893e720): Can't locate object method >> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line >> 141, line 125. 
>> >> Error returned while evaluating value of 'description' option for >> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa546c), feature >> Bio::Location::Simple=HASH(0x893e6b4): Can't locate object method >> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line >> 141, line 125. >> $ >> >> The plasmid is a circular DNA, why is the diagram in linear? ;-) >> >> Martin >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > > -- Dr. Martin Mokrejs Dept. of Genetics and Microbiology Faculty of Science, Charles University Vinicna 5, 128 43 Prague, Czech Republic http://www.iresite.org http://www.iresite.org/~mmokrejs From dhoworth at mrc-lmb.cam.ac.uk Fri Jun 15 04:59:09 2007 From: dhoworth at mrc-lmb.cam.ac.uk (Dave Howorth) Date: Fri, 15 Jun 2007 09:59:09 +0100 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? In-Reply-To: <46717990.6040509@ribosome.natur.cuni.cz> References: <466938F6.7050903@ribosome.natur.cuni.cz> <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> <467178AE.5040905@ribosome.natur.cuni.cz> <46717990.6040509@ribosome.natur.cuni.cz> Message-ID: <467254DD.3010505@mrc-lmb.cam.ac.uk> Martin MOKREJ? wrote: >>> Also, there is a *huge* amount of documentation and examples on >>> the BioPerl website. >>> >>> http://www.bioperl.org/wiki/HOWTOs >> You mean >> http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File >> ? 
;-)
>
> $ perl embl2picture.pl ~/99.gb | display - Error returned while
> evaluating value of 'description' option for glyph
> Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature
> Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method
> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl
> line 141, line 125.

Hmm, an error at line 141 of a 69-line script? Methinks you're not
actually running the script that's presented on the wiki page you
quoted.

I cut-and-pasted the script and your file and it worked for me (at
least, it produced an image, along with a bunch of OOPS lines).

HTH, Dave

From n.haigh at sheffield.ac.uk Fri Jun 15 06:21:38 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 15 Jun 2007 11:21:38 +0100
Subject: [Bioperl-l] Installation using --install_base
Message-ID: <46726832.7080601@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I'm setting up a new installation of Debian 4.0 at home and thought I'd
try to install BioPerl as a normal user rather than root. So in CPAN
options I set the --install_base to /home/username/perl and set PERL5LIB
to point to the same place.

Now, when I run "perl Build.PL" on the HEAD of BioPerl as a non-root
user and ask to install all optional modules, it tries to install them
through CPAN - however it seems to fail because some dependencies don't
seem to want to install in a user directory.

Has anyone else found this or might I be doing something wrong?
Nath -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGcmgyczuW2jkwy2gRAtgqAKDIv717ciVHr5V+Z1kqPV2a++E8dgCfYr2a VPt4tEPLW2J+BiKnN3B8aV8= =c+9z -----END PGP SIGNATURE----- From bix at sendu.me.uk Fri Jun 15 06:07:04 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 15 Jun 2007 11:07:04 +0100 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> Message-ID: <467264C8.4020202@sendu.me.uk> Chris Fields wrote: > On Jun 14, 2007, at 2:00 PM, Jason Stajich wrote: > >> On Jun 14, 2007, at 11:32 AM, Chris Fields wrote: >> >>> To chip in on this, I only use perltidy when I need to clean bioperl >>> code up for debugging (particularly if blocks are hard to see) and >>> just use the defaults. I agree it would be nice to have everything >>> tidied up but it'll definitely need to be a consensus config file. >>> >> Can we do any sort of massive conversion at some logical timepoint. >> Probably after a branch release or something? Because it basically >> means we're going to have differences on nearly every line which is >> going to make diff-ing difficult when debugging old/new versions. >> Maybe it is not a problem because we aren't introducing and new bugs! Sorry, can you clarify the problem you envisage? And why would making a branch release help? > I agree; if we intend on doing this it should be all at once, maybe > on a branch dedicated to ensure that code changes don't tank tests > (they shouldn't but one never knows). We would then need a script up- > and-running that tidies everything up prior to commits (though what > happens if perltidy tanks?...). > > Sendu, up for it? If its going to be difficult and a hassle, for such an unnecessary thing I'm not sure its worth it. There are more pressing things to be done for Bioperl. 
If I can just run perltidy on the entire package and commit, I'd do it. If that's not appropriate, I won't. >>> About svn [snip] > Stepped into that one, didn't I! I'll look into how much effort is > involved and try getting something going in the next month or two, > maybe sooner if time permits. I'm lacking on SVN-foo as well but it > might be worth looking into. I'd put this in the unnecessary-but-nice category as well. If it will be as easy as my ->new change, go ahead. If not, there are more pressing matters (POD fixing, test script updating and finishing...). From n.haigh at sheffield.ac.uk Fri Jun 15 06:35:40 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 15 Jun 2007 11:35:40 +0100 Subject: [Bioperl-l] Installation using --install_base Message-ID: <46726B7C.7070902@sheffield.ac.uk> I'm setting up a new installation of Debian 4.0 at home and though I'd try to install BioPerl as a normal user rather than root. So in CPAN options I set the --install_base to /home/username/perl and set PERL5LIB to point to the same place. Now, when I run "perl Build.PL" on the HEAD of BioPerl as a non root user and ask to install all optional modules, it tries to install them through CPAN - however it seems to fail because some dependencies don't seem to want to install in a user directory. Has anyone else found this or might I be doing something wrong? Nath From bix at sendu.me.uk Fri Jun 15 06:45:48 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 15 Jun 2007 11:45:48 +0100 Subject: [Bioperl-l] Installation using --install_base In-Reply-To: <46726832.7080601@sheffield.ac.uk> References: <46726832.7080601@sheffield.ac.uk> Message-ID: <46726DDC.8090202@sendu.me.uk> Nathan S. Haigh wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > I'm setting up a new installation of Debian 4.0 at home and though I'd > try to install BioPerl as a normal user rather than root. 
So in CPAN > options I set the --install_base to /home/username/perl and set PERL5LIB > to point to the same place. > > Now, when I run "perl Build.PL" on the HEAD of BioPerl as a non root > user and ask to install all optional modules, it tries to install them > through CPAN - however it seems to fail because some dependencies don't > seem to want to install in a user directory. > > Has anyone else found this or might I be doing something wrong? You'll need to configure CPAN to install into your user directory. Upgrade to the latest version, then go read the docs on the various configurable options. I thought I at least mentioned this in the Bioperl INSTALL doc. If not, can someone come up with a concise clarification? From sdavis2 at mail.nih.gov Fri Jun 15 06:56:08 2007 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Fri, 15 Jun 2007 06:56:08 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <467264C8.4020202@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> Message-ID: <46727048.3080904@mail.nih.gov> Sendu Bala wrote: > If its going to be difficult and a hassle, for such an unnecessary thing > I'm not sure its worth it. There are more pressing things to be done for > Bioperl. > > If I can just run perltidy on the entire package and commit, I'd do it. > If that's not appropriate, I won't. I agree with the sentiment noted above. I'm a bit of an outsider here, but bioperl is a collaborative project. Not everyone has the same sentiments about what "correct" style means. As a programmer, I really wouldn't want significant changes on the style of my code. And perl happily puts up with many styles. I would say leave things as they are--let the individual programmers choose. It reduces the amount of work of questionable importance and allows the coding style freedom that perl supports. Just my $.02. 
Sean

From cjfields at uiuc.edu Fri Jun 15 10:05:07 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 09:05:07 -0500
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file?
In-Reply-To: <46723F91.60501@ribosome.natur.cuni.cz>
References: <466938F6.7050903@ribosome.natur.cuni.cz> <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> <467178AE.5040905@ribosome.natur.cuni.cz> <46717990.6040509@ribosome.natur.cuni.cz> <46723F91.60501@ribosome.natur.cuni.cz>
Message-ID:

On Jun 15, 2007, at 2:28 AM, Martin MOKREJŠ wrote:

> Chris Fields wrote:
>> Is 99.gb supposed to be a GenBank file? And you're loading it into
>
> Yes, it was attached to the email. ;)

Sorry about that. I notice that '.' was added, but the spacing seemed
off. I think bioperl catches that fine but it's something Wayne should
consider.

>> embl2picture (which I assume takes EMBL format files)? Without
>> example
>> code we can easily make the wrong assumptions (i.e. that this is user
>> error and not a BioPerl problem).
>
> use constant USAGE =><<END;
> Usage: $0 <file>
> Render a GenBank/EMBL entry into drawable form.
> Return as a GIF or PNG image on standard output.
>
> File must be in embl, genbank, or another SeqIO-
> recognized format. Only the first entry will be
> rendered.
>
> Example to try:
> embl2picture.pl factor7.embl | display -
>
> END

Horribly named script (should be seq2picture, since it converts both
gb/embl). The use of 'all_tags' makes me think the script version you
are using is old, as those methods have long since been renamed. Dave
has it working though, so maybe your version has been updated? The
'use of uninitialized data in' errors are probably from inclusion of
mandatory fields with no data or '.'.

>> Also, I don't believe the feature plotting scripts plot circular
>> chromosomes/plasmids. If you want this functionality you'll have to
>> code it for yourself.
>
> That's a pity it does not, but at least if someone could improve
> the docs.
;) > Unfortunately I don't have the time to rewrite the code myself now, > I need a working, standalone, already available tool. :( > M. As I said, unless someone shows interest and codes it just won't get done. We have had very little interest in this, either b/c there are tools already out there to do this very thing (multitudes of plasmid drawing programs, some free like ApE) or that nobody's bothered to write it up. chris From cjfields at uiuc.edu Fri Jun 15 10:22:23 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 15 Jun 2007 09:22:23 -0500 Subject: [Bioperl-l] Perltidy and... SVN and ...Re: Perltidy In-Reply-To: <46727048.3080904@mail.nih.gov> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> <46727048.3080904@mail.nih.gov> Message-ID: <78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu> On Jun 15, 2007, at 5:56 AM, Sean Davis wrote: > Sendu Bala wrote: >> If its going to be difficult and a hassle, for such an unnecessary >> thing >> I'm not sure its worth it. There are more pressing things to be >> done for >> Bioperl. >> >> If I can just run perltidy on the entire package and commit, I'd >> do it. >> If that's not appropriate, I won't. > > I agree with the sentiment noted above. I'm a bit of an outsider > here, > but bioperl is a collaborative project. Not everyone has the same > sentiments about what "correct" style means. As a programmer, I > really > wouldn't want significant changes on the style of my code. And perl > happily puts up with many styles. I would say leave things as they > are--let the individual programmers choose. It reduces the amount of > work of questionable importance and allows the coding style freedom > that > perl supports. > > Just my $.02. > > Sean I tend to run it on modules that need some reformatting (SearchIO::blast comes to mind). 
I believe you're correct when this comes down to programming style, but I think this echoes a sentiment (frustration, perhaps) that some of us have with long-term maintenance of said code. Maybe a compromise: include a copy of .perltidyrc with the distribution that goes by what a consensus wants or by the general rules laid out in Perl Best Practices (spaced settings, use of spaces over tabs, etc). Conversion would be encouraged but voluntary, with the caveat that if someone needs to clean up code down the road (bug fixes, enhancements, etc) and if the original author isn't able to add it in themselves, it could be perltidy'd in order to help the developer (locate and fix the issue)|(add relevant enhancement where needed). chris From cjfields at uiuc.edu Fri Jun 15 10:56:23 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 15 Jun 2007 09:56:23 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <467264C8.4020202@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> Message-ID: <3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote: >>>> ... >>> Can we do any sort of massive conversion at some logical timepoint. >>> Probably after a branch release or something? Because it basically >>> means we're going to have differences on nearly every line which is >>> going to make diff-ing difficult when debugging old/new versions. >>> Maybe it is not a problem because we aren't introducing and new >>> bugs! > > Sorry, can you clarify the problem you envisage? And why would > making a branch release help? Maybe the worry is that mass conversion in such a large codebase could lead to hard-to-locate bugs. Shouldn't occur but who knows w/o trying? 
>> I agree; if we intend on doing this it should be all at once, >> maybe on a branch dedicated to ensure that code changes don't >> tank tests (they shouldn't but one never knows). We would then >> need a script up- and-running that tidies everything up prior to >> commits (though what happens if perltidy tanks?...). >> Sendu, up for it? > > If its going to be difficult and a hassle, for such an unnecessary > thing I'm not sure its worth it. There are more pressing things to > be done for Bioperl. > > If I can just run perltidy on the entire package and commit, I'd do > it. If that's not appropriate, I won't. The choices aren't necessarily all or nothing. What about voluntary, recommended use of a perltidy config file included with the distribution, with additional 'caveats'? See my response to Sean. >>>> About svn > [snip] >> Stepped into that one, didn't I! I'll look into how much effort >> is involved and try getting something going in the next month or >> two, maybe sooner if time permits. I'm lacking on SVN-foo as >> well but it might be worth looking into. > > I'd put this in the unnecessary-but-nice category as well. If it > will be as easy as my ->new change, go ahead. If not, there are > more pressing matters (POD fixing, test script updating and > finishing...). A few other open-bio projects have actively discussed a CVS->SVN migration (BioRuby and I think BioPython, though the latter could be wrong). As I said, "it might be worth looking into" to weigh the pros/cons, get others opinions from others who have made the transition, etc. We could, as Jason suggested, even set up a tester SVN w/o making it the default codebase (lock it off to a few testers, have CVS commits automatically/manually carry over to SVN, etc). I agree with you that it's not feasible to switch over prior to a release and that there are more pressing issues, but it doesn't hurt having an open discussion about it. 
chris From sdavis2 at mail.nih.gov Fri Jun 15 11:15:57 2007 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Fri, 15 Jun 2007 11:15:57 -0400 Subject: [Bioperl-l] Perltidy and... SVN and ...Re: Perltidy In-Reply-To: <78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> <46727048.3080904@mail.nih.gov> <78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu> Message-ID: <4672AD2D.2090001@mail.nih.gov> Chris Fields wrote: > > On Jun 15, 2007, at 5:56 AM, Sean Davis wrote: > >> Sendu Bala wrote: >>> If its going to be difficult and a hassle, for such an unnecessary thing >>> I'm not sure its worth it. There are more pressing things to be done for >>> Bioperl. >>> >>> If I can just run perltidy on the entire package and commit, I'd do it. >>> If that's not appropriate, I won't. >> >> I agree with the sentiment noted above. I'm a bit of an outsider here, >> but bioperl is a collaborative project. Not everyone has the same >> sentiments about what "correct" style means. As a programmer, I really >> wouldn't want significant changes on the style of my code. And perl >> happily puts up with many styles. I would say leave things as they >> are--let the individual programmers choose. It reduces the amount of >> work of questionable importance and allows the coding style freedom that >> perl supports. >> >> Just my $.02. >> >> Sean > > I tend to run it on modules that need some reformatting (SearchIO::blast > comes to mind). I believe you're correct when this comes down to > programming style, but I think this echoes a sentiment (frustration, > perhaps) that some of us have with long-term maintenance of said code. 
> > Maybe a compromise: include a copy of .perltidyrc with the distribution > that goes by what a consensus wants or by the general rules laid out in > Perl Best Practices (spaced settings, use of spaces over tabs, etc). > Conversion would be encouraged but voluntary, with the caveat that if > someone needs to clean up code down the road (bug fixes, enhancements, > etc) and if the original author isn't able to add it in themselves, it > could be perltidy'd in order to help the developer (locate and fix the > issue)|(add relevant enhancement where needed). Don't get me wrong--I think whatever makes bioperl a better, more maintainable beast should be what is done. The bioperl gurus should absolutely do what is best for them for code maintainability. Sean From n.haigh at sheffield.ac.uk Fri Jun 15 11:17:15 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 15 Jun 2007 16:17:15 +0100 Subject: [Bioperl-l] Perltidy and... SVN and ...Re: Perltidy In-Reply-To: <78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> <46727048.3080904@mail.nih.gov> <78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu> Message-ID: <4672AD7B.4050109@sheffield.ac.uk> Chris Fields wrote: > On Jun 15, 2007, at 5:56 AM, Sean Davis wrote: > >> Sendu Bala wrote: >>> If its going to be difficult and a hassle, for such an unnecessary >>> thing >>> I'm not sure its worth it. There are more pressing things to be >>> done for >>> Bioperl. >>> >>> If I can just run perltidy on the entire package and commit, I'd >>> do it. >>> If that's not appropriate, I won't. >> I agree with the sentiment noted above. I'm a bit of an outsider >> here, >> but bioperl is a collaborative project. Not everyone has the same >> sentiments about what "correct" style means. 
As a programmer, I really
>> wouldn't want significant changes on the style of my code. And perl
>> happily puts up with many styles. I would say leave things as they
>> are--let the individual programmers choose. It reduces the amount of
>> work of questionable importance and allows the coding style freedom that
>> perl supports.
>>
>> Just my $.02.
>>
>> Sean
>
> I tend to run it on modules that need some reformatting
> (SearchIO::blast comes to mind). I believe you're correct when this
> comes down to programming style, but I think this echoes a sentiment
> (frustration, perhaps) that some of us have with long-term
> maintenance of said code.
>
> Maybe a compromise: include a copy of .perltidyrc with the
> distribution that goes by what a consensus wants or by the general
> rules laid out in Perl Best Practices (spaced settings, use of spaces
> over tabs, etc).

RE spaces, tabs etc. - how well are the different coding styles handled for display in HTML and via the online browsable CVS?

> Conversion would be encouraged but voluntary, with
> the caveat that if someone needs to clean up code down the road (bug
> fixes, enhancements, etc) and if the original author isn't able to
> add it in themselves, it could be perltidy'd in order to help the
> developer (locate and fix the issue)|(add relevant enhancement where
> needed).
>
> chris
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From johnsonm at gmail.com Fri Jun 15 15:37:26 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Fri, 15 Jun 2007 14:37:26 -0500
Subject: [Bioperl-l] Why does Bio::DB::GFF::Feature::gff3_string swap start and stop coordinates??
In-Reply-To:
References: <79FDA731-CC37-42B0-8200-0865F52C1CAC@uiuc.edu> <62034FE5-C375-49F3-9A4E-2545F93615F4@uiuc.edu> <9FAD90F3-79B3-4002-9A11-6C11F7D00614@uiuc.edu>
Message-ID:

Patches waiting in Bugzilla (Bug #2299).
Changes:

-Bio::Tools::Glimmer now emits Bio::SeqFeature::Generic features for prokaryotic reports (Glimmer2/Glimmer3)
-Bio::Tools::Glimmer now produces features with Fuzzy or Split locations as appropriate (partial or circular/wraparound predictions)
-Bio::Tools::Glimmer goes through the Glimmer3 .detail file to pull out sequence lengths
-Bio::Tools::Run::Glimmer passes along the sequence length to Bio::Tools::Glimmer for Glimmer2

I should probably modify Bio::Tools::Genemark to use Bio::SeqFeature::Generic features for prokaryotic reports, to be consistent, but this is more likely to surprise people. If nobody screams about the change to Bio::Tools::Glimmer, I'll do it at some point.

On 5/21/07, Chris Fields wrote:
>
> On May 21, 2007, at 7:29 PM, Torsten Seemann wrote:
>
> >> glimmer2/3 both assume the genome is circular by default (I'm
> >> assuming since Glimmer2/3 are used for bacterial genomes). Acc. to
> >> the Glimmer3 release notes the detail file has the information in the
> >> header; from the Glimmer3 data used for tests:
> >
> > You beat me to the reply Chris - yes, Glimmer2/3 assume circular
> > chromosome by default. I had forgotten about this in earlier
> > discussions of the new Glimmer parsers as I normally run it in
> > --linear / -L mode (even if I know it is circular) because it is
> > easier to handle, and our sequencer/assembler team usually gets the
> > origin of replication right.
> >
> >> Command: /bio/sw/glimmer3/bin/glimmer3 -o 50 -g 110 -t 30 ../BCTDNA
> >> Glimmer3.icm Glimmer3
> >
> > I did a double-take here - that's the path to my Glimmer3
> > installation! It took me a couple of minutes to realise that you got
> > it from the bioperl test data I created. D'oh! :-)
>
> Yep, I forgot about that!
>
> >> There are options available for glimmer3 (-L, -X) that specify a
> >> linear sequence or allow ORFs to extend past the end of the sequence
> >> analyzed (the latter assumes a linear sequence).
> > > > If the -L mode should produce Bio::Location::Split objects, I guess if > > -X is used > > it should produce Bio::Location::Fuzzy objects too... > > > > --Torsten > > True, didn't think about that one. Def. something to consider adding > in. > > chris > > > From cjfields at uiuc.edu Fri Jun 15 16:55:06 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 15 Jun 2007 15:55:06 -0500 Subject: [Bioperl-l] Why does Bio::DB::GFF::Feature::gff3_string swap start and stop coordinates?? In-Reply-To: References: <79FDA731-CC37-42B0-8200-0865F52C1CAC@uiuc.edu> <62034FE5-C375-49F3-9A4E-2545F93615F4@uiuc.edu> <9FAD90F3-79B3-4002-9A11-6C11F7D00614@uiuc.edu> Message-ID: I'll try getting to that in tonight. Been pretty tied up lately... chris On Jun 15, 2007, at 2:37 PM, Mark Johnson wrote: > Patches waiting in Bugzilla (Bug #2299). Changes: > > -Bio::Tools::Glimmer now emits Bio::SeqFeature::Generic features for > prokaryotic reports (Glimmer2/Glimmer3) > -Bio::Tools::Glimmer now produces features with Fuzzy or Split > locations as appropriate (partial or circular/wraparound predictions) > -Bio::Tools:Glimmer goes through the Glimmer3 .detail file to pull out > sequence lengths > -Bio::Tools::Run::Glimmer passes along the sequence length to > Bio::Tools::Glimmer for Glimmer2 > > I should probably modify Bio::Tools::Genemark to use > Bio::SeqFeature::Generic features for prokaryotic reports, to be > consistent, but this is more likely to surprise people. If nobody > screams about the change to Bio::Tools::Glimmer, I'll do it at some > point. > > On 5/21/07, Chris Fields wrote: >> >> On May 21, 2007, at 7:29 PM, Torsten Seemann wrote: >> >>>> glimmer2/3 both assume the genome is circular by default (I'm >>>> assuming since Glimmer2/3 are used for bacterial genomes). Acc. 
to >>>> the Glimmer3 release notes the detail file has the information >>>> in the >>>> header; from the Glimmer3 data used for tests: >>> >>> You beat me to the reply Chris - yes, Glimmer2/3 assume circular >>> chromosome by default. I had forgotten about this in earlier >>> discussions of the new Glimmer parsers as I normally run it in >>> --linear / -L mode (even if I know it is circular) because it is >>> easier to handle, and our sequencer/assembler team usually gets the >>> origin of replication right. >>> >>>> Command: /bio/sw/glimmer3/bin/glimmer3 -o 50 -g 110 -t 30 ../ >>>> BCTDNA >>>> Glimmer3.icm Glimmer3 >>> >>> I did a double-take here - that's the path to my Glimmer3 >>> installation! It took me a couple of minutes to realise that you got >>> it from the bioperl test data I created. D'oh! :-) >> >> Yep, I forgot about that! >> >>>> There are options available for glimmer3 (-L, -X) that specify a >>>> linear sequence or allow ORFs to extend past the end of the >>>> sequence >>>> analyzed (the latter assumes a linear sequence). >>> >>> If the -L mode should produce Bio::Location::Split objects, I >>> guess if >>> -X is used >>> it should produce Bio::Location::Fuzzy objects too... >>> >>> --Torsten >> >> True, didn't think about that one. Def. something to consider adding >> in. >> >> chris >> >> >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From rvos at interchange.ubc.ca Fri Jun 15 17:08:17 2007 From: rvos at interchange.ubc.ca (rvos) Date: Fri, 15 Jun 2007 14:08:17 -0700 (PDT) Subject: [Bioperl-l] SVN and ...Re: Perltidy Message-ID: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> Hi, I would very much prefer it if bioperl moved to svn. 
I'm considering merging Bio::Phylo (to the extent that that's possible/practical) with bioperl and move it to an OBF repository, but I'd rather not go back to CVS. Rutger -----Original Message----- > Date: Fri Jun 15 07:56:23 PDT 2007 > From: "Chris Fields" > Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy > To: "Sendu Bala" > > > On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote: > > >>>> ... > >>> Can we do any sort of massive conversion at some logical timepoint. > >>> Probably after a branch release or something? Because it basically > >>> means we're going to have differences on nearly every line which is > >>> going to make diff-ing difficult when debugging old/new versions. > >>> Maybe it is not a problem because we aren't introducing and new > >>> bugs! > > > > Sorry, can you clarify the problem you envisage? And why would > > making a branch release help? > > Maybe the worry is that mass conversion in such a large codebase > could lead to hard-to-locate bugs. Shouldn't occur but who knows w/o > trying? > > >> I agree; if we intend on doing this it should be all at once, > >> maybe on a branch dedicated to ensure that code changes don't > >> tank tests (they shouldn't but one never knows). We would then > >> need a script up- and-running that tidies everything up prior to > >> commits (though what happens if perltidy tanks?...). > >> Sendu, up for it? > > > > If its going to be difficult and a hassle, for such an unnecessary > > thing I'm not sure its worth it. There are more pressing things to > > be done for Bioperl. > > > > If I can just run perltidy on the entire package and commit, I'd do > > it. If that's not appropriate, I won't. > > The choices aren't necessarily all or nothing. What about voluntary, > recommended use of a perltidy config file included with the > distribution, with additional 'caveats'? See my response to Sean. > > >>>> About svn > > [snip] > >> Stepped into that one, didn't I! 
I'll look into how much effort > >> is involved and try getting something going in the next month or > >> two, maybe sooner if time permits. I'm lacking on SVN-foo as > >> well but it might be worth looking into. > > > > I'd put this in the unnecessary-but-nice category as well. If it > > will be as easy as my ->new change, go ahead. If not, there are > > more pressing matters (POD fixing, test script updating and > > finishing...). > > A few other open-bio projects have actively discussed a CVS->SVN > migration (BioRuby and I think BioPython, though the latter could be > wrong). As I said, "it might be worth looking into" to weigh the > pros/cons, get others opinions from others who have made the > transition, etc. We could, as Jason suggested, even set up a tester > SVN w/o making it the default codebase (lock it off to a few testers, > have CVS commits automatically/manually carry over to SVN, etc). > > I agree with you that it's not feasible to switch over prior to a > release and that there are more pressing issues, but it doesn't hurt > having an open discussion about it. > > chris > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From spiros at lokku.com Fri Jun 15 17:40:32 2007 From: spiros at lokku.com (Spiros Denaxas) Date: Fri, 15 Jun 2007 22:40:32 +0100 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> Message-ID: On 6/15/07, rvos wrote: > Hi, > > I would very much prefer it if bioperl moved to svn. I'm considering merging Bio::Phylo (to the extent that that's possible/practical) with bioperl and move it to an OBF repository, but I'd rather not go back to CVS. > > Rutger > I second that, SVN seems like the reasonable choice. I would be more than happy to help out as well. 
Spiros > > -----Original Message----- > > > Date: Fri Jun 15 07:56:23 PDT 2007 > > From: "Chris Fields" > > Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy > > To: "Sendu Bala" > > > > > > On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote: > > > > >>>> ... > > >>> Can we do any sort of massive conversion at some logical timepoint. > > >>> Probably after a branch release or something? Because it basically > > >>> means we're going to have differences on nearly every line which is > > >>> going to make diff-ing difficult when debugging old/new versions. > > >>> Maybe it is not a problem because we aren't introducing and new > > >>> bugs! > > > > > > Sorry, can you clarify the problem you envisage? And why would > > > making a branch release help? > > > > Maybe the worry is that mass conversion in such a large codebase > > could lead to hard-to-locate bugs. Shouldn't occur but who knows w/o > > trying? > > > > >> I agree; if we intend on doing this it should be all at once, > > >> maybe on a branch dedicated to ensure that code changes don't > > >> tank tests (they shouldn't but one never knows). We would then > > >> need a script up- and-running that tidies everything up prior to > > >> commits (though what happens if perltidy tanks?...). > > >> Sendu, up for it? > > > > > > If its going to be difficult and a hassle, for such an unnecessary > > > thing I'm not sure its worth it. There are more pressing things to > > > be done for Bioperl. > > > > > > If I can just run perltidy on the entire package and commit, I'd do > > > it. If that's not appropriate, I won't. > > > > The choices aren't necessarily all or nothing. What about voluntary, > > recommended use of a perltidy config file included with the > > distribution, with additional 'caveats'? See my response to Sean. > > > > >>>> About svn > > > [snip] > > >> Stepped into that one, didn't I! 
I'll look into how much effort > > >> is involved and try getting something going in the next month or > > >> two, maybe sooner if time permits. I'm lacking on SVN-foo as > > >> well but it might be worth looking into. > > > > > > I'd put this in the unnecessary-but-nice category as well. If it > > > will be as easy as my ->new change, go ahead. If not, there are > > > more pressing matters (POD fixing, test script updating and > > > finishing...). > > > > A few other open-bio projects have actively discussed a CVS->SVN > > migration (BioRuby and I think BioPython, though the latter could be > > wrong). As I said, "it might be worth looking into" to weigh the > > pros/cons, get others opinions from others who have made the > > transition, etc. We could, as Jason suggested, even set up a tester > > SVN w/o making it the default codebase (lock it off to a few testers, > > have CVS commits automatically/manually carry over to SVN, etc). > > > > I agree with you that it's not feasible to switch over prior to a > > release and that there are more pressing issues, but it doesn't hurt > > having an open discussion about it. 
> > > > chris

From hlapp at gmx.net Fri Jun 15 18:10:25 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 15 Jun 2007 18:10:25 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To:
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
Message-ID: <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>

So should we set up a sandbox svn repository and those who would like to help out

- take shots at migrating bioperl (any current cvs snapshot will do) to svn

- you document what you find yourself having to do in trying to make it work

- you report back when you think you have a working repository

- we all get a defined amount of time to test to our hearts' content, say 2 weeks

- you fix issues that were encountered

- report back when done, followed by retesting for, say 1 week

- iterate previous 2 steps until no issues and no objections to migration

- two more weeks of warning period to all developers to commit all outstanding changes, or reapply them to a future svn checkout

- pull the trigger by locking down cvs, applying the migration as worked out before, and announcing that BioPerl is now on svn

- get free beer at next BOSC (I'll pay if no one else does)

This may not be precisely the plan that needs to be executed, but it's probably somewhere along those lines.

If there are volunteers who would like to spearhead this, then power to you - I think everyone is in favor and the advantages of svn don't need to be debated. The only reason it hasn't happened yet is because no one has stepped forward who would have the energy.
I'm sure ChrisD will gladly create the svn sandbox if we have volunteers lined up to get going. -hilmar On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote: > On 6/15/07, rvos wrote: >> Hi, >> >> I would very much prefer it if bioperl moved to svn. I'm >> considering merging Bio::Phylo (to the extent that that's possible/ >> practical) with bioperl and move it to an OBF repository, but I'd >> rather not go back to CVS. >> >> Rutger >> > > I second that, SVN seems like the reasonable choice. I would be more > than happy to help out as well. > > Spiros > >> >> -----Original Message----- >> >>> Date: Fri Jun 15 07:56:23 PDT 2007 >>> From: "Chris Fields" >>> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy >>> To: "Sendu Bala" >>> >>> >>> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote: >>> >>>>>>> ... >>>>>> Can we do any sort of massive conversion at some logical >>>>>> timepoint. >>>>>> Probably after a branch release or something? Because it >>>>>> basically >>>>>> means we're going to have differences on nearly every line >>>>>> which is >>>>>> going to make diff-ing difficult when debugging old/new versions. >>>>>> Maybe it is not a problem because we aren't introducing and new >>>>>> bugs! >>>> >>>> Sorry, can you clarify the problem you envisage? And why would >>>> making a branch release help? >>> >>> Maybe the worry is that mass conversion in such a large codebase >>> could lead to hard-to-locate bugs. Shouldn't occur but who knows >>> w/o >>> trying? >>> >>>>> I agree; if we intend on doing this it should be all at once, >>>>> maybe on a branch dedicated to ensure that code changes don't >>>>> tank tests (they shouldn't but one never knows). We would then >>>>> need a script up- and-running that tidies everything up prior to >>>>> commits (though what happens if perltidy tanks?...). >>>>> Sendu, up for it? >>>> >>>> If its going to be difficult and a hassle, for such an unnecessary >>>> thing I'm not sure its worth it. 
There are more pressing things to >>>> be done for Bioperl. >>>> >>>> If I can just run perltidy on the entire package and commit, I'd do >>>> it. If that's not appropriate, I won't. >>> >>> The choices aren't necessarily all or nothing. What about >>> voluntary, >>> recommended use of a perltidy config file included with the >>> distribution, with additional 'caveats'? See my response to Sean. >>> >>>>>>> About svn >>>> [snip] >>>>> Stepped into that one, didn't I! I'll look into how much effort >>>>> is involved and try getting something going in the next month or >>>>> two, maybe sooner if time permits. I'm lacking on SVN-foo as >>>>> well but it might be worth looking into. >>>> >>>> I'd put this in the unnecessary-but-nice category as well. If it >>>> will be as easy as my ->new change, go ahead. If not, there are >>>> more pressing matters (POD fixing, test script updating and >>>> finishing...). >>> >>> A few other open-bio projects have actively discussed a CVS->SVN >>> migration (BioRuby and I think BioPython, though the latter could be >>> wrong). As I said, "it might be worth looking into" to weigh the >>> pros/cons, get others opinions from others who have made the >>> transition, etc. We could, as Jason suggested, even set up a tester >>> SVN w/o making it the default codebase (lock it off to a few >>> testers, >>> have CVS commits automatically/manually carry over to SVN, etc). >>> >>> I agree with you that it's not feasible to switch over prior to a >>> release and that there are more pressing issues, but it doesn't hurt >>> having an open discussion about it. 
>>> >>> chris

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From jason at bioperl.org Fri Jun 15 18:23:15 2007
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 15 Jun 2007 15:23:15 -0700
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
Message-ID:

Sounds like a plan. I'll be curious to see if we can still keep anonymous CVS working, as I'd like to not have to pull the plug on that. There are some threads out on the web about how to do this with a commit rule on SVN.

Also, can someone who is close enough to all the SVN benefits please elaborate how it is going to help _this_ project? Perhaps you would be willing to put a few words up -- like on (a to be created): http://bioperl.org/wiki/BioPerl:Version_control_changeover

This way, if anonymous CVS is broken and/or developers who haven't been paying attention come back to commit code and ask why things changed, we don't have to compose long emails...
=) -jason On Jun 15, 2007, at 3:10 PM, Hilmar Lapp wrote: > So should we set up a sandbox svn repository and those who would like > to help out > > - take shots at migrating bioperl (any current cvs snapshot will do) > to svn > > - you document what you find yourself having to do in trying to make > it work > > - you report back when you think you have a working repository > > - we all get a defined amount of time to test to our hearts' content, > say 2 weeks > > - you fix issues that were encountered > > - report back when done, followed by retesting for, say 1 week > > - iterate previous 2 steps until no issues and no objections to > migration > > - two more weeks of warning period to all developers to commit all > outstanding changes, or reapply them to a future svn checkout > > - pull the trigger by locking down cvs, applying the migration as > worked out before, and announcing that BioPerl is now on svn > > - get free beer at next BOSC (I'll pay if no one else does) > > This may not be precisely the plan that needs to be executed, but > it's probably somewhere along those lines. > > If there are volunteers who would like to spearhead this, then power > to you - I think everyone is in favor and the advantages of svn don't > need to be debated. The only reason it hasn't happened yet is because > no one has stepped forward who would have the energy. > > I'm sure ChrisD will gladly create the svn sandbox if we have > volunteers lined up to get going. > > -hilmar > > On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote: > >> On 6/15/07, rvos wrote: >>> Hi, >>> >>> I would very much prefer it if bioperl moved to svn. I'm >>> considering merging Bio::Phylo (to the extent that that's possible/ >>> practical) with bioperl and move it to an OBF repository, but I'd >>> rather not go back to CVS. >>> >>> Rutger >>> >> >> I second that, SVN seems like the reasonable choice. I would be more >> than happy to help out as well. 
>> >> Spiros >> >>> >>> -----Original Message----- >>> >>>> Date: Fri Jun 15 07:56:23 PDT 2007 >>>> From: "Chris Fields" >>>> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy >>>> To: "Sendu Bala" >>>> >>>> >>>> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote: >>>> >>>>>>>> ... >>>>>>> Can we do any sort of massive conversion at some logical >>>>>>> timepoint. >>>>>>> Probably after a branch release or something? Because it >>>>>>> basically >>>>>>> means we're going to have differences on nearly every line >>>>>>> which is >>>>>>> going to make diff-ing difficult when debugging old/new >>>>>>> versions. >>>>>>> Maybe it is not a problem because we aren't introducing and new >>>>>>> bugs! >>>>> >>>>> Sorry, can you clarify the problem you envisage? And why would >>>>> making a branch release help? >>>> >>>> Maybe the worry is that mass conversion in such a large codebase >>>> could lead to hard-to-locate bugs. Shouldn't occur but who knows >>>> w/o >>>> trying? >>>> >>>>>> I agree; if we intend on doing this it should be all at once, >>>>>> maybe on a branch dedicated to ensure that code changes don't >>>>>> tank tests (they shouldn't but one never knows). We would then >>>>>> need a script up- and-running that tidies everything up prior to >>>>>> commits (though what happens if perltidy tanks?...). >>>>>> Sendu, up for it? >>>>> >>>>> If its going to be difficult and a hassle, for such an unnecessary >>>>> thing I'm not sure its worth it. There are more pressing things to >>>>> be done for Bioperl. >>>>> >>>>> If I can just run perltidy on the entire package and commit, >>>>> I'd do >>>>> it. If that's not appropriate, I won't. >>>> >>>> The choices aren't necessarily all or nothing. What about >>>> voluntary, >>>> recommended use of a perltidy config file included with the >>>> distribution, with additional 'caveats'? See my response to Sean. >>>> >>>>>>>> About svn >>>>> [snip] >>>>>> Stepped into that one, didn't I! 
I'll look into how much effort >>>>>> is involved and try getting something going in the next month or >>>>>> two, maybe sooner if time permits. I'm lacking on SVN-foo as >>>>>> well but it might be worth looking into. >>>>> >>>>> I'd put this in the unnecessary-but-nice category as well. If it >>>>> will be as easy as my ->new change, go ahead. If not, there are >>>>> more pressing matters (POD fixing, test script updating and >>>>> finishing...). >>>> >>>> A few other open-bio projects have actively discussed a CVS->SVN >>>> migration (BioRuby and I think BioPython, though the latter >>>> could be >>>> wrong). As I said, "it might be worth looking into" to weigh the >>>> pros/cons, get others opinions from others who have made the >>>> transition, etc. We could, as Jason suggested, even set up a >>>> tester >>>> SVN w/o making it the default codebase (lock it off to a few >>>> testers, >>>> have CVS commits automatically/manually carry over to SVN, etc). >>>> >>>> I agree with you that it's not feasible to switch over prior to a >>>> release and that there are more pressing issues, but it doesn't >>>> hurt >>>> having an open discussion about it. 
>>>> >>>> chris >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From sheris at eps.berkeley.edu Fri Jun 15 18:58:12 2007 From: sheris at eps.berkeley.edu (Sheri Simmons) Date: Fri, 15 Jun 2007 15:58:12 -0700 Subject: [Bioperl-l] seq doesn't validate error Message-ID: <200706151558.12911.sheris@eps.berkeley.edu> Hi, I'm getting an error as follows when I try to reverse complement a sequence string stored in a hash of arrays. The storage code is: $nstarthash{$key} = [$sortchecks[0], join("", @nseq), join("",@{$seqhash{$key}})]; the sequence of interest is the element at index 1. 
Later, I try to retrieve this string for a subset of keys so I can reverse complement it based on input from another hash (%complement): my %revcomphash = map { my $read = $_; grep $complement{$read} eq 'C', %complement; {$_, (Bio::Seq->new(-seq =>$nstarthash{$_}[1]))->revcom->seq()};} keys(%nstarthash); I get the following warning (long sequence edited for clarity): -- -------------------- WARNING --------------------- MSG: seq doesn't validate, mismatch is 1 --------------------------------------------------- ------------- EXCEPTION ------------- MSG: Attempting to set the sequence to [GCCCCTGTAATCGCTTTTATATCGTCAGCGATC] which does not look healthy STACK Bio::PrimarySeq::seq /usr/share/perl5/Bio/PrimarySeq.pm:268 STACK Bio::PrimarySeq::new /usr/share/perl5/Bio/PrimarySeq.pm:217 STACK Bio::Seq::new /usr/share/perl5/Bio/Seq.pm:498 STACK toplevel ../quality_wrapper.pl:103 I cannot find any non-allowed characters in the sequence, and the de-referencing appears to work correctly. Can anyone help me? I'm using the latest Bioperl installation (1.5.2) with ActivePerl5.8 on a Mepis 6.5 system. Thanks Sheri --------------------------------------------------------------------- Sheri Simmons Department of Earth and Planetary Sciences University of California, Berkeley Berkeley, CA 94720-4767 From Kevin.M.Brown at asu.edu Fri Jun 15 19:11:34 2007 From: Kevin.M.Brown at asu.edu (Kevin Brown) Date: Fri, 15 Jun 2007 16:11:34 -0700 Subject: [Bioperl-l] seq doesn't validate error In-Reply-To: <200706151558.12911.sheris@eps.berkeley.edu> References: <200706151558.12911.sheris@eps.berkeley.edu> Message-ID: <1A4207F8295607498283FE9E93B775B4034C6E7B@EX02.asurite.ad.asu.edu> > I'm getting an error as follows when I try to reverse > complement a sequence string stored in a hash of arrays. The > storage code is: > > $nstarthash{$key} = [$sortchecks[0], join("", > @nseq), > join("",@{$seqhash{$key}})]; > > the sequence of interest is the element at index 1. 
> > Later, I try to retrieve this string for a subset of keys so > I can reverse complement it based on input from another hash > (%complement): > > my %revcomphash = map { my $read = $_; > grep $complement{$read} eq 'C', %complement; > {$_, (Bio::Seq->new(-seq > =>$nstarthash{$_}[1]))->revcom->seq()};} > keys(%nstarthash); > > > I get the following warning (long sequence edited for clarity): > > -- -------------------- WARNING --------------------- > MSG: seq doesn't validate, mismatch is 1 > --------------------------------------------------- > > ------------- EXCEPTION ------------- > MSG: Attempting to set the sequence to > [GCCCCTGTAATCGCTTTTATATCGTCAGCGATC] > which does not look healthy > STACK Bio::PrimarySeq::seq /usr/share/perl5/Bio/PrimarySeq.pm:268 > STACK Bio::PrimarySeq::new /usr/share/perl5/Bio/PrimarySeq.pm:217 > STACK Bio::Seq::new /usr/share/perl5/Bio/Seq.pm:498 STACK > toplevel ../quality_wrapper.pl:103 > > I cannot find any non-allowed characters in the sequence, and > the de-referencing appears to work correctly. Can anyone help me? > I'm using the latest Bioperl installation (1.5.2) with > ActivePerl5.8 on a Mepis 6.5 system. Try telling the Bio::Seq object what alphabet to use when creating it. I tend to create them like: Bio::Seq->new(-seq=> $seqvar, -alphabet=>'dna') From sheris at eps.berkeley.edu Fri Jun 15 19:53:04 2007 From: sheris at eps.berkeley.edu (Sheri Simmons) Date: Fri, 15 Jun 2007 16:53:04 -0700 Subject: [Bioperl-l] seq doesn't validate error In-Reply-To: <1A4207F8295607498283FE9E93B775B4034C6E7B@EX02.asurite.ad.asu.edu> References: <200706151558.12911.sheris@eps.berkeley.edu> <1A4207F8295607498283FE9E93B775B4034C6E7B@EX02.asurite.ad.asu.edu> Message-ID: <200706151653.04135.sheris@eps.berkeley.edu> Thanks for the suggestion, but that still gives the same error as before. 
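One way to narrow this down, sketched in plain Perl so it runs without BioPerl installed. It assumes the failure comes from a character that does not show up when the string is printed (for example a newline or other whitespace left over from parsing), which nothing in this thread confirms. The two hashes are made-up stand-ins for %nstarthash and %complement, and bad_chars() and revcomp() are hypothetical helpers, not BioPerl functions. The character class is an assumed set of IUPAC nucleotide codes plus gap symbols; Bio::PrimarySeq may accept a different set. The sketch also moves the grep out of the map block: as posted, the grep result is discarded, so every key of %nstarthash is processed rather than only the reads marked 'C'.

```perl
use strict;
use warnings;

# Made-up stand-ins for the poster's %nstarthash and %complement.
my %nstarthash = (
    read1 => [ 10, "GCCCCTGTAATC", "q" ],
    read2 => [ 20, "ACGTTTGA\n",   "q" ],    # stray newline left by parsing
    read3 => [ 30, "AAACCC",       "q" ],
);
my %complement = ( read1 => 'C', read2 => 'C', read3 => 'U' );

# Hypothetical helper: list each character outside an assumed set of IUPAC
# nucleotide codes and gap symbols, with its offset and ordinal value, so
# that invisible characters become visible.
sub bad_chars {
    my ($seq) = @_;
    my @found;
    while ( $seq =~ /([^ACGTUMRWSYKVHDBXN.-])/gi ) {
        push @found, sprintf( "offset %d ord %d", pos($seq) - 1, ord($1) );
    }
    return @found;
}

# Hypothetical helper: plain-Perl reverse complement of unambiguous DNA.
# With BioPerl the equivalent call from this thread would be
#   Bio::Seq->new(-seq => $seq, -alphabet => 'dna')->revcom->seq
sub revcomp {
    my ($seq) = @_;
    my $rc = reverse $seq;          # scalar context: reverses the string
    $rc =~ tr/ACGTacgt/TGCAtgca/;
    return $rc;
}

# Select the reads marked 'C' first, then transform only those.
my %revcomphash;
for my $read ( grep { exists $complement{$_} && $complement{$_} eq 'C' }
               sort keys %nstarthash )
{
    my $seq = $nstarthash{$read}[1];
    if ( my @bad = bad_chars($seq) ) {
        warn "$read skipped: @bad\n";
        next;
    }
    $revcomphash{$read} = revcomp($seq);
}

print "$_ => $revcomphash{$_}\n" for sort keys %revcomphash;
```

With the stand-in data this prints "read1 => GATTACAGGGGC" and warns "read2 skipped: offset 8 ord 10", which points at the trailing newline. Running bad_chars() on the real $nstarthash{$key}[1] values would show whether something similar is hiding in the failing sequence.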
On Friday 15 June 2007 4:11 pm, Kevin Brown wrote: > > I'm getting an error as follows when I try to reverse > > complement a sequence string stored in a hash of arrays. The > > storage code is: > > > > $nstarthash{$key} = [$sortchecks[0], join("", > > @nseq), > > join("",@{$seqhash{$key}})]; > > > > the sequence of interest is the element at index 1. > > > > Later, I try to retrieve this string for a subset of keys so > > I can reverse complement it based on input from another hash > > (%complement): > > > > my %revcomphash = map { my $read = $_; > > grep $complement{$read} eq 'C', %complement; > > {$_, (Bio::Seq->new(-seq > > =>$nstarthash{$_}[1]))->revcom->seq()};} > > keys(%nstarthash); > > > > > > I get the following warning (long sequence edited for clarity): > > > > -- -------------------- WARNING --------------------- > > MSG: seq doesn't validate, mismatch is 1 > > --------------------------------------------------- > > > > ------------- EXCEPTION ------------- > > MSG: Attempting to set the sequence to > > [GCCCCTGTAATCGCTTTTATATCGTCAGCGATC] > > which does not look healthy > > STACK Bio::PrimarySeq::seq /usr/share/perl5/Bio/PrimarySeq.pm:268 > > STACK Bio::PrimarySeq::new /usr/share/perl5/Bio/PrimarySeq.pm:217 > > STACK Bio::Seq::new /usr/share/perl5/Bio/Seq.pm:498 STACK > > toplevel ../quality_wrapper.pl:103 > > > > I cannot find any non-allowed characters in the sequence, and > > the de-referencing appears to work correctly. Can anyone help me? > > I'm using the latest Bioperl installation (1.5.2) with > > ActivePerl5.8 on a Mepis 6.5 system. > > Try telling the Bio::Seq object what alphabet to use when creating it. 
> I tend to create them like: > > Bio::Seq->new(-seq=> $seqvar, -alphabet=>'dna') -- Sheri Simmons Department of Earth and Planetary Sciences University of California, Berkeley Berkeley, CA 94720-4767 From hlapp at gmx.net Fri Jun 15 21:27:42 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 15 Jun 2007 21:27:42 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <18035.14352.963113.473274@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <18035.14352.963113.473274@almost.alerce.com> Message-ID: Could you post a ticket to the helpdesk: support at open-bio.org. -hilmar On Jun 15, 2007, at 9:08 PM, George Hartzell wrote: > Hilmar Lapp writes: >> So should we set up a sandbox svn repository and those who would like >> to help out >> >> - take shots at migrating bioperl (any current cvs snapshot will do) >> to svn > > Free Beer, huh? Do you deliver? > > Can you package up a tarball of the cvs repository (bzip or gzip would > save some time) itself? > > thanks! > > g. -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hartzell at alerce.com Fri Jun 15 21:08:32 2007 From: hartzell at alerce.com (George Hartzell) Date: Fri, 15 Jun 2007 21:08:32 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> Message-ID: <18035.14352.963113.473274@almost.alerce.com> Hilmar Lapp writes: > So should we set up a sandbox svn repository and those who would like > to help out > > - take shots at migrating bioperl (any current cvs snapshot will do) > to svn Free Beer, huh? Do you deliver? Can you package up a tarball of the cvs repository (bzip or gzip would save some time) itself? thanks! g. 
From cjfields at uiuc.edu Fri Jun 15 21:42:05 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 15 Jun 2007 20:42:05 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <18035.14352.963113.473274@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <18035.14352.963113.473274@almost.alerce.com> Message-ID: <6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu> The browsable CVS has a 'Download tarball' link if that helps. http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/? cvsroot=bioperl chris On Jun 15, 2007, at 8:08 PM, George Hartzell wrote: > Hilmar Lapp writes: >> So should we set up a sandbox svn repository and those who would like >> to help out >> >> - take shots at migrating bioperl (any current cvs snapshot will do) >> to svn > > Free Beer, huh? Do you deliver? > > Can you package up a tarball of the cvs repository (bzip or gzip would > save some time) itself? > > thanks! > > g. From cjfields at uiuc.edu Fri Jun 15 21:50:09 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 15 Jun 2007 20:50:09 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> Message-ID: <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> I'll help out to the extent I can w/o having the SVN know-how. We need (as Jason points out) someone who can detail the benefits and maybe keep an updated journal on the wiki. I believe at least one or two of the other Bio* contemplated moving over to SVN, which may be worth checking out. 
chris On Jun 15, 2007, at 5:10 PM, Hilmar Lapp wrote: > So should we set up a sandbox svn repository and those who would like > to help out > > - take shots at migrating bioperl (any current cvs snapshot will do) > to svn > > - you document what you find yourself having to do in trying to make > it work > > - you report back when you think you have a working repository > > - we all get a defined amount of time to test to our hearts' content, > say 2 weeks > > - you fix issues that were encountered > > - report back when done, followed by retesting for, say 1 week > > - iterate previous 2 steps until no issues and no objections to > migration > > - two more weeks of warning period to all developers to commit all > outstanding changes, or reapply them to a future svn checkout > > - pull the trigger by locking down cvs, applying the migration as > worked out before, and announcing that BioPerl is now on svn > > - get free beer at next BOSC (I'll pay if no one else does) > > This may not be precisely the plan that needs to be executed, but > it's probably somewhere along those lines. > > If there are volunteers who would like to spearhead this, then power > to you - I think everyone is in favor and the advantages of svn don't > need to be debated. The only reason it hasn't happened yet is because > no one has stepped forward who would have the energy. > > I'm sure ChrisD will gladly create the svn sandbox if we have > volunteers lined up to get going. > > -hilmar > > On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote: > >> On 6/15/07, rvos wrote: >>> Hi, >>> >>> I would very much prefer it if bioperl moved to svn. I'm >>> considering merging Bio::Phylo (to the extent that that's possible/ >>> practical) with bioperl and move it to an OBF repository, but I'd >>> rather not go back to CVS. >>> >>> Rutger >>> >> >> I second that, SVN seems like the reasonable choice. I would be more >> than happy to help out as well. 
>> >> Spiros >> >> [snip] Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Fri Jun 15 22:12:55 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 15 Jun 2007 22:12:55 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <18035.14352.963113.473274@almost.alerce.com> <6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu> Message-ID: <6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net> I think he meant the cvs repository itself, containing all the change data. -hilmar On Jun 15, 2007, at 9:42 PM, Chris Fields wrote: > The browsable CVS has a 'Download tarball' link if that helps. > > http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/? 
> cvsroot=bioperl > > chris > > On Jun 15, 2007, at 8:08 PM, George Hartzell wrote: > >> Hilmar Lapp writes: >>> So should we set up a sandbox svn repository and those who would >>> like >>> to help out >>> >>> - take shots at migrating bioperl (any current cvs snapshot will do) >>> to svn >> >> Free Beer, huh? Do you deliver? >> >> Can you package up a tarball of the cvs repository (bzip or gzip >> would >> save some time) itself? >> >> thanks! >> >> g. > > > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Fri Jun 15 22:37:55 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 15 Jun 2007 21:37:55 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <18035.14352.963113.473274@almost.alerce.com> <6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu> <6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net> Message-ID: Ah, got it. Sorry. George, planning on taking this up? chris On Jun 15, 2007, at 9:12 PM, Hilmar Lapp wrote: > I think he meant the cvs repository itself, containing all the > change data. -hilmar > > On Jun 15, 2007, at 9:42 PM, Chris Fields wrote: > >> The browsable CVS has a 'Download tarball' link if that helps. >> >> http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/? >> cvsroot=bioperl >> >> chris >> >> On Jun 15, 2007, at 8:08 PM, George Hartzell wrote: >> >>> Hilmar Lapp writes: >>>> So should we set up a sandbox svn repository and those who would >>>> like >>>> to help out >>>> >>>> - take shots at migrating bioperl (any current cvs snapshot will >>>> do) >>>> to svn >>> >>> Free Beer, huh? Do you deliver? >>> >>> Can you package up a tarball of the cvs repository (bzip or gzip >>> would >>> save some time) itself? 
>>> >>> thanks! >>> >>> g. >> >> >> > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Sat Jun 16 04:20:57 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Sat, 16 Jun 2007 09:20:57 +0100 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <18035.14352.963113.473274@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <18035.14352.963113.473274@almost.alerce.com> Message-ID: <46739D69.4090204@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 George Hartzell wrote: > Hilmar Lapp writes: > > So should we set up a sandbox svn repository and those who would like > > to help out > > > > - take shots at migrating bioperl (any current cvs snapshot will do) > > to svn > > Free Beer, huh? Do you deliver? > > Can you package up a tarball of the cvs repository (bzip or gzip would > save some time) itself? > > thanks! > > g. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Sounds like George might know what he's doing! I have a question about setting up svn access. I believe access can be done in several ways, over webdav, over ssh and probably others too. Do you have any knowledge about the benefits of one over the other? I suppose I'm thinking of what to implement to allow anonymous read access for users and authenticated access for developers. Nath p.s. if you need any monkeys to do some work I'm happy to help out as much as possible. 
-----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGc51pczuW2jkwy2gRAmi9AJ0XojVdh4ckXoc3bwVSmeNw95cR7QCfV+G9 Lb9NUEe4dkCakQ+Gc7Py98A= =BG9m -----END PGP SIGNATURE----- From rvos at interchange.ubc.ca Sat Jun 16 06:37:11 2007 From: rvos at interchange.ubc.ca (rvos) Date: Sat, 16 Jun 2007 03:37:11 -0700 (PDT) Subject: [Bioperl-l] SVN and ...Re: Perltidy Message-ID: <15232024.1181990231860.JavaMail.myubc2@handel.my.ubc.ca> I can volunteer some time to help out with this. Rutger -----Original Message----- > Date: Fri Jun 15 15:10:25 PDT 2007 > From: "Hilmar Lapp" > Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy > To: spiros at lokku.com > > So should we set up a sandbox svn repository and those who would like > to help out > > - take shots at migrating bioperl (any current cvs snapshot will do) > to svn > > - you document what you find yourself having to do in trying to make > it work > > - you report back when you think you have a working repository > > - we all get a defined amount of time to test to our hearts' content, > say 2 weeks > > - you fix issues that were encountered > > - report back when done, followed by retesting for, say 1 week > > - iterate previous 2 steps until no issues and no objections to > migration > > - two more weeks of warning period to all developers to commit all > outstanding changes, or reapply them to a future svn checkout > > - pull the trigger by locking down cvs, applying the migration as > worked out before, and announcing that BioPerl is now on svn > > - get free beer at next BOSC (I'll pay if no one else does) > > This may not be precisely the plan that needs to be executed, but > it's probably somewhere along those lines. > > If there are volunteers who would like to spearhead this, then power > to you - I think everyone is in favor and the advantages of svn don't > need to be debated. 
The only reason it hasn't happened yet is because > no one has stepped forward who would have the energy. > > I'm sure ChrisD will gladly create the svn sandbox if we have > volunteers lined up to get going. > > -hilmar > > On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote: > > > On 6/15/07, rvos wrote: > >> Hi, > >> > >> I would very much prefer it if bioperl moved to svn. I'm > >> considering merging Bio::Phylo (to the extent that that's possible/ > >> practical) with bioperl and move it to an OBF repository, but I'd > >> rather not go back to CVS. > >> > >> Rutger > >> > > > > I second that, SVN seems like the reasonable choice. I would be more > > than happy to help out as well. > > > > Spiros > > > > [snip] From sdavis2 at mail.nih.gov Sat Jun 16 07:21:47 2007 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Sat, 16 Jun 2007 07:21:47 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> Message-ID: <4673C7CB.1030709@mail.nih.gov> Chris Fields wrote: > I'll help out to the extent I can w/o having the SVN know-how. We > need (as Jason points out) someone who can detail the benefits and > maybe keep an updated journal on the wiki. > > I believe at least one or two of the other Bio* contemplated moving > over to SVN, which may be worth checking out. > The bioconductor project is on SVN. The project includes over 200 packages (the equivalent of perl modules) with something around 150-200 ACTIVE developers. They also have a build system for several OSes that operates on a cron-like system with builds of several versions approximately daily. 
Their system is running at something like revision 30,000, so they have significant experience. If anyone would like technical support, I can certainly ask the folks maintaining their site if they can give some input. Let me know if anyone would like a contact person. As for access, the typical access is over http (or https). Access controls can be set up on the server side while allowing anonymous access for checkout. There are many excellent SVN clients for every OS, so that should not be a problem. Sean From cjfields at uiuc.edu Sat Jun 16 10:02:35 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 16 Jun 2007 09:02:35 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <4673C7CB.1030709@mail.nih.gov> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> Message-ID: <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> On Jun 16, 2007, at 6:21 AM, Sean Davis wrote: > Chris Fields wrote: >> I'll help out to the extent I can w/o having the SVN know-how. We >> need (as Jason points out) someone who can detail the benefits and >> maybe keep an updated journal on the wiki. >> >> I believe at least one or two of the other Bio* contemplated moving >> over to SVN, which may be worth checking out. >> > The bioconductor project is on SVN. The project includes over 200 > packages (the equivalent of perl modules) with something around > 150-200 > ACTIVE developers. They also have a build system for several OSes > that > operates on a cron-like system with builds of several versions > approximately daily. Their system is running at something like > revision > 30,000, so they have significant experience. If anyone would like > technical support, I can certainly ask the folks maintaining their > site > if they can give some input. Let me know if anyone would like a > contact > person. 
> > As for access, the typical access is over http (or https). Access > controls can be set up on the server side while allowing anonymous > access for checkout. There are many excellent SVN clients for every OS, so > that > should not be a problem. > > Sean It looks like George Hartzell may be taking a crack at it, with Rutger Vos, Nathan Haigh, and moi helping out where needed. If so we could have something testable relatively soon. After that we'll need to work out a few other issues, basically what's on Hilmar's list. chris From hlapp at gmx.net Sat Jun 16 10:40:08 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Sat, 16 Jun 2007 10:40:08 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> Message-ID: <51E89347-4AF7-482E-98DB-BE1AA0138A91@gmx.net> Just as an aside, even if we can't keep anonymous cvs working, I would think that using apache URL rewriting and a small CGI script that returns an appropriate page redirect we can without too much trouble keep the hyperlinks functional that people may have bookmarked. -hilmar On Jun 15, 2007, at 6:23 PM, Jason Stajich wrote: > Sounds like a plan, I'll be curious to see if we can still keep > anonymous CVS working as I'd like to not have to pull the plug on > that. There are some threads out on the web about how to do this > with a commit rule on SVN. > > Also, can someone who is close enough to all the SVN benefits > please elaborate how it is going to help _this_ project? > Perhaps you would be willing to put a few words up -- like on (a to > be created): > http://bioperl.org/wiki/BioPerl:Version_control_changeover > > This way if anonymous CVS is broken and/or developers who haven't > been paying attention come back to commit code and ask why things > changed we don't have to compose long emails... 
=) > > -jason > On Jun 15, 2007, at 3:10 PM, Hilmar Lapp wrote: > >> So should we set up a sandbox svn repository and those who would like >> to help out >> >> - take shots at migrating bioperl (any current cvs snapshot will do) >> to svn >> >> - you document what you find yourself having to do in trying to make >> it work >> >> - you report back when you think you have a working repository >> >> - we all get a defined amount of time to test to our hearts' content, >> say 2 weeks >> >> - you fix issues that were encountered >> >> - report back when done, followed by retesting for, say 1 week >> >> - iterate previous 2 steps until no issues and no objections to >> migration >> >> - two more weeks of warning period to all developers to commit all >> outstanding changes, or reapply them to a future svn checkout >> >> - pull the trigger by locking down cvs, applying the migration as >> worked out before, and announcing that BioPerl is now on svn >> >> - get free beer at next BOSC (I'll pay if no one else does) >> >> This may not be precisely the plan that needs to be executed, but >> it's probably somewhere along those lines. >> >> If there are volunteers who would like to spearhead this, then power >> to you - I think everyone is in favor and the advantages of svn don't >> need to be debated. The only reason it hasn't happened yet is because >> no one has stepped forward who would have the energy. > >> >> I'm sure ChrisD will gladly create the svn sandbox if we have >> volunteers lined up to get going. >> >> -hilmar >> >> On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote: >> >>> On 6/15/07, rvos wrote: >>>> Hi, >>>> >>>> I would very much prefer it if bioperl moved to svn. I'm >>>> considering merging Bio::Phylo (to the extent that that's possible/ >>>> practical) with bioperl and move it to an OBF repository, but I'd >>>> rather not go back to CVS. >>>> >>>> Rutger >>>> >>> >>> I second that, SVN seems like the reasonable choice. 
I would be more >>> than happy to help out as well. >>> >>> Spiros >>> >>>> >>>> -----Original Message----- >>>> >>>>> Date: Fri Jun 15 07:56:23 PDT 2007 >>>>> From: "Chris Fields" >>>>> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy >>>>> To: "Sendu Bala" >>>>> >>>>> >>>>> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote: >>>>> >>>>>>>>> ... >>>>>>>> Can we do any sort of massive conversion at some logical >>>>>>>> timepoint. >>>>>>>> Probably after a branch release or something? Because it >>>>>>>> basically >>>>>>>> means we're going to have differences on nearly every line >>>>>>>> which is >>>>>>>> going to make diff-ing difficult when debugging old/new >>>>>>>> versions. >>>>>>>> Maybe it is not a problem because we aren't introducing and new >>>>>>>> bugs! >>>>>> >>>>>> Sorry, can you clarify the problem you envisage? And why would >>>>>> making a branch release help? >>>>> >>>>> Maybe the worry is that mass conversion in such a large codebase >>>>> could lead to hard-to-locate bugs. Shouldn't occur but who knows >>>>> w/o >>>>> trying? >>>>> >>>>>>> I agree; if we intend on doing this it should be all at once, >>>>>>> maybe on a branch dedicated to ensure that code changes don't >>>>>>> tank tests (they shouldn't but one never knows). We would then >>>>>>> need a script up- and-running that tidies everything up prior to >>>>>>> commits (though what happens if perltidy tanks?...). >>>>>>> Sendu, up for it? >>>>>> >>>>>> If its going to be difficult and a hassle, for such an >>>>>> unnecessary >>>>>> thing I'm not sure its worth it. There are more pressing >>>>>> things to >>>>>> be done for Bioperl. >>>>>> >>>>>> If I can just run perltidy on the entire package and commit, >>>>>> I'd do >>>>>> it. If that's not appropriate, I won't. >>>>> >>>>> The choices aren't necessarily all or nothing. What about >>>>> voluntary, >>>>> recommended use of a perltidy config file included with the >>>>> distribution, with additional 'caveats'? See my response to Sean. 
>>>>> >>>>>>>>> About svn >>>>>> [snip] >>>>>>> Stepped into that one, didn't I! I'll look into how much effort >>>>>>> is involved and try getting something going in the next >>>>>>> month or >>>>>>> two, maybe sooner if time permits. I'm lacking on SVN-foo as >>>>>>> well but it might be worth looking into. >>>>>> >>>>>> I'd put this in the unnecessary-but-nice category as well. If it >>>>>> will be as easy as my ->new change, go ahead. If not, there are >>>>>> more pressing matters (POD fixing, test script updating and >>>>>> finishing...). >>>>> >>>>> A few other open-bio projects have actively discussed a CVS->SVN >>>>> migration (BioRuby and I think BioPython, though the latter >>>>> could be >>>>> wrong). As I said, "it might be worth looking into" to weigh the >>>>> pros/cons, get others opinions >from others who have made the >>>>> transition, etc. We could, as Jason suggested, even set up a >>>>> tester >>>>> SVN w/o making it the default codebase (lock it off to a few >>>>> testers, >>>>> have CVS commits automatically/manually carry over to SVN, etc). >>>>> >>>>> I agree with you that it's not feasible to switch over prior to a >>>>> release and that there are more pressing issues, but it doesn't >>>>> hurt >>>>> having an open discussion about it. 
>>>>> >>>>> chris >>>>> >>>>> _______________________________________________ >>>>> Bioperl-l mailing list >>>>> Bioperl-l at lists.open-bio.org >>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> -- >> =========================================================== >> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >> =========================================================== >> >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > Jason Stajich > jason at bioperl.org > http://jason.open-bio.org/ > > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Sat Jun 16 10:55:09 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Sat, 16 Jun 2007 10:55:09 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <4673C7CB.1030709@mail.nih.gov> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> Message-ID: On Jun 16, 2007, at 7:21 AM, Sean Davis wrote: > As for access, the typical access is over http (or https). We're using svn+ssh here (NESCent) so the password is the same as the one you set for your account on the server, and you can use public/ private key negotiation for authentication. I think the ability to not provide a password for every single interaction is a requirement. 
If that requires using svn+ssh or can be made to work through https too I don't know. On sf.net I have to use https for svn and it doesn't ask me for the password each time. Not sure how this works though, maybe some local caching? We should not be using http, or whatever other protocol that sends unencrypted passwords. > Access controls can be set up on the server side while allowing > anonymous access for checkout. There are many excellent SVN for > every OS, so that should not be a problem. On Mac OSX the most convenient way I have found is through fink. It does ask to install 30 other dependencies, which had me balk at first, but me doing it by hand is even worse than fink doing it, so I finally gave in and it's really a breeze. I've not had a single issue. From a sysadmin perspective, what might be worth keeping in mind is that svn is going to store everything in a database (BerkeleyDB I think). I.e., there is no such thing anymore as restoring individual source code files from backup if one gets accidentally corrupted on the server. It seems you have to restore the entire database, i.e., the entire repository. I vaguely recall though that how svn manages the repository is actually configurable and that other storage than DB is possible too. Don't ask me for the pros and cons of one vs the other. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From rvos at interchange.ubc.ca Sat Jun 16 13:09:18 2007 From: rvos at interchange.ubc.ca (rvos) Date: Sat, 16 Jun 2007 10:09:18 -0700 (PDT) Subject: [Bioperl-l] SVN and ...Re: Perltidy Message-ID: <10517356.1182013758160.JavaMail.myubc2@brahms.my.ubc.ca> CIPRES and Bio::Phylo use svn. 
As for the benefits, a lot of sales talk has been expended over it already, for my own purpose I like the integration with eclipse (through subclipse plugin) and komodo, in addition to the atomic commits (so I can ctrl+c if I goof up (again)). For standalone use on osx I didn't use the fink one, but I forgot where I did get it from. It was very easy to set up, though. On windows there is a really nice standalone one (tortoisesvn) that integrates with the explorer so you can see on the file icons what the state of a file is. I know that there's a cvs2svn utility that converts your revision history (seems a requirement). Rutger -----Original Message----- > Date: Sat Jun 16 07:55:09 PDT 2007 > From: "Hilmar Lapp" > Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy > To: "Sean Davis" > > > On Jun 16, 2007, at 7:21 AM, Sean Davis wrote: > > > As for access, the typical access is over http (or https). > > We're using svn+ssh here (NESCent) so the password is the same as the > one you set for your account on the server, and you can use public/ > private key negotiation for authentication. > > I think the ability to not provide a password for every single > interaction is a requirement. If that requires using svn+ssh or can > be made to work through https too I don't know. On sf.net I have to > use https for svn and it doesn't ask me for the password each time. > Not sure how this works though, maybe some local caching? > > We should not be using http, or whatever other protocol that sends > unencrypted passwords. > > > Access controls can be set up on the server side while allowing > > anonymous access for checkout. There are many excellent SVN for > > every OS, so that should not be a problem. > > On Mac OSX the most convenient way I have found is through fink. It > does ask to install 30 other dependencies, which had me balk at > first, but me doing it by hand is even worse than fink doing it, so I > finally gave in and it's really a breeze. I've not had a single issue. 
> > From a sysadmin perspective, what might be worth keeping in mind is > that svn is going to store everything in a database (BerkeleyDB I > think). I.e., there is no such thing anymore as restoring individual > source code files from backup if one gets accidentally corrupted on > the server. It seems you have to restore the entire database, i.e., > the entire repository. I vaguely recall though that how svn manages > the repository is actually configurable and that other storage than > DB is possible too. Don't ask me for the pros and cons of one vs the > other. > > -hilmar > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From rvos at interchange.ubc.ca Sat Jun 16 13:15:45 2007 From: rvos at interchange.ubc.ca (rvos) Date: Sat, 16 Jun 2007 10:15:45 -0700 (PDT) Subject: [Bioperl-l] SVN and ...Re: Perltidy Message-ID: <27462805.1182014145280.JavaMail.myubc2@brahms.my.ubc.ca> A brief word on the topic of perltidy: no. I like what it does, and I sort of follow one of its settings (-syn -sob -b), but if you run it on a whole source tree it'll screw up the diffs, and I'm still worried about it breaking things (though really it shouldn't, it creates a *.bak if something doesn't compile anymore). Rutger -----Original Message----- > Date: Sat Jun 16 10:09:18 PDT 2007 > From: "rvos" > Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy > To: "Hilmar Lapp" , "Sean Davis" > > CIPRES and Bio::Phylo use svn. As for the benefits, a lot of sales talk has been expended over it already, for my own purpose I like the integration with eclipse (through subclipse plugin) and komodo, in addition to the atomic commits (so I can ctrl+c if I goof up (again)). 
> > For standalone use on osx I didn't use the fink one, but I forgot where I did get it from. It was very easy to set up, though. On windows there is a really nice standalone one (tortoisesvn) that integrates with the explorer so you can see on the file icons what the state of a file is. I know that there's a cvs2svn utility that converts your revision history (seems a requirement). > > Rutger > > > -----Original Message----- > > > Date: Sat Jun 16 07:55:09 PDT 2007 > > From: "Hilmar Lapp" > > Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy > > To: "Sean Davis" > > > > > > On Jun 16, 2007, at 7:21 AM, Sean Davis wrote: > > > > > As for access, the typical access is over http (or https). > > > > We're using svn+ssh here (NESCent) so the password is the same as the > > one you set for your account on the server, and you can use public/ > > private key negotiation for authentication. > > > > I think the ability to not provide a password for every single > > interaction is a requirement. If that requires using svn+ssh or can > > be made to work through https too I don't know. On sf.net I have to > > use https for svn and it doesn't ask me for the password each time. > > Not sure how this works though, maybe some local caching? > > > > We should not be using http, or whatever other protocol that sends > > unencrypted passwords. > > > > > Access controls can be set up on the server side while allowing > > > anonymous access for checkout. There are many excellent SVN for > > > every OS, so that should not be a problem. > > > > On Mac OSX the most convenient way I have found is through fink. It > > does ask to install 30 other dependencies, which had me balk at > > first, but me doing it by hand is even worse than fink doing it, so I > > finally gave in and it's really a breeze. I've not had a single issue. > > > > From a sysadmin perspective, what might be worth keeping in mind is > > that svn is going to store everything in a database (BerkeleyDB I > > think). 
I.e., there is no such thing anymore as restoring individual > > source code files from backup if one gets accidentally corrupted on > > the server. It seems you have to restore the entire database, i.e., > > the entire repository. I vaguely recall though that how svn manages > > the repository is actually configurable and that other storage than > > DB is possible too. Don't ask me for the pros and cons of one vs the > > other. > > > > -hilmar > > -- > > =========================================================== > > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > > =========================================================== > > > > > > > > > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From george.heller at yahoo.com Sat Jun 16 13:29:26 2007 From: george.heller at yahoo.com (George Heller) Date: Sat, 16 Jun 2007 10:29:26 -0700 (PDT) Subject: [Bioperl-l] Taxonomy hierarchy extraction Message-ID: <959624.48556.qm@web56502.mail.re3.yahoo.com> Hi all, I am looking at extracting the taxonomy hierarchy for some taxon ids. What I plan to do is, for a given taxon id, say 33090, I want to extract all taxon ids that are children of this species. I do not just want the immediate children, but the children's children and so on. Any ideas on the way I can go about doing this? George
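The traversal asked about above (children, then the children's children, and so on) is a plain recursive walk. What follows is only a minimal sketch, not BioPerl code: the parent-to-children table is a toy stand-in for a real taxonomy database, every id other than 33090 is made up, and the names all_descendants, %children_of and $lookup are hypothetical.

```perl
use strict;
use warnings;

# Toy parent => children table standing in for a real taxonomy database.
# Every id except 33090 is invented for illustration.
my %children_of = (
    33090 => [ 1001, 1002 ],
    1001  => [ 2001, 2002 ],
    1002  => [ 2003 ],
    2002  => [ 3001 ],
);

# Return every id below $id (children, grandchildren, ...), depth first.
# $get_children is a code ref mapping one id to the list of its direct children.
sub all_descendants {
    my ( $id, $get_children ) = @_;
    my @found;
    for my $child ( $get_children->($id) ) {
        # Record the child itself, then everything underneath it.
        push @found, $child, all_descendants( $child, $get_children );
    }
    return @found;
}

# Lookup for the toy table; ids with no entry are treated as leaves.
my $lookup = sub { @{ $children_of{ $_[0] } || [] } };

my @ids = all_descendants( 33090, $lookup );
print join( ",", @ids ), "\n";    # prints 1001,2001,2002,3001,1002,2003
```

Passing the child lookup in as a code ref keeps the recursion independent of where the data comes from. With BioPerl the lookup would presumably wrap each_Descendent on a Bio::DB::Taxonomy handle, as the reply that follows suggests; that wiring is an assumption and is not shown or tested here.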
From bix at sendu.me.uk Sat Jun 16 14:21:38 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Sat, 16 Jun 2007 19:21:38 +0100 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <959624.48556.qm@web56502.mail.re3.yahoo.com> References: <959624.48556.qm@web56502.mail.re3.yahoo.com> Message-ID: <46742A32.90305@sendu.me.uk> George Heller wrote: > Hi all, > > I am looking at extracting the taxonomy hierarchy for some taxon ids. > What I plan to do is, for a given taxon id, say 33090, I want to > extract all taxon ids that are children of this species. I do not > just want the immediate children, but the children's children and so > on. > > Any ideas on the way I can go about doing this? Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in some kind of looping structure. Most easily a recursing sub. If you happen to code up something neat and efficient, why not share it with us and we could add it to the Taxonomy module(s). From cjfields at uiuc.edu Sat Jun 16 15:23:43 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 16 Jun 2007 14:23:43 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> Message-ID: On Jun 16, 2007, at 9:55 AM, Hilmar Lapp wrote: > > On Jun 16, 2007, at 7:21 AM, Sean Davis wrote: > >> As for access, the typical access is over http (or https). > > We're using svn+ssh here (NESCent) so the password is the same as the > one you set for your account on the server, and you can use public/ > private key negotiation for authentication. > > I think the ability to not provide a password for every single > interaction is a requirement. If that requires using svn+ssh or can > be made to work through https too I don't know. On sf.net I have to > use https for svn and it doesn't ask me for the password each time. 
> Not sure how this works though, maybe some local caching? > > We should not be using http, or whatever other protocol that sends > unencrypted passwords. Agreed; it should be through ssh. >> Access controls can be set up on the server side while allowing >> anonymous access for checkout. There are many excellent SVN for >> every OS, so that should not be a problem. > > On Mac OSX the most convenient way I have found is through fink. It > does ask to install 30 other dependencies, which had me balk at > first, but me doing it by hand is even worse than fink doing it, so I > finally gave in and it's really a breeze. I've not had a single issue. > > From a sysadmin perspective, what might be worth keeping in mind is > that svn is going to store everything in a database (BerkeleyDB I > think). I.e., there is no such thing anymore as restoring individual > source code files from backup if one gets accidentally corrupted on > the server. It seems you have to restore the entire database, i.e., > the entire repository. I vaguely recall though that how svn manages > the repository is actually configurable and that other storage than > DB is possible too. Don't ask me for the pros and cons of one vs the > other. MacPorts/DarwinPorts also has subversion, various language bindings, cvs2svn, and various perl modules. There are also a few SVN GUIs lingering around (including live folders within Komodo). chris From cjfields at uiuc.edu Sat Jun 16 15:18:06 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 16 Jun 2007 14:18:06 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <27462805.1182014145280.JavaMail.myubc2@brahms.my.ubc.ca> References: <27462805.1182014145280.JavaMail.myubc2@brahms.my.ubc.ca> Message-ID: <1A314D08-8F3C-4A4B-B58D-64AC7952F149@uiuc.edu> I think it's viable as an option if the code really needs it. After 100+ commits some of the code has schizy coding styles, so cleaning it up helps. 
In those cases having a perltidy config file present wouldn't hurt. However I agree that it shouldn't be applied across every module and should be done judiciously (the commit message, for instance, should actually state the code was tidied). chris PS - Nice to see the ball is rolling on SVN! On Jun 16, 2007, at 12:15 PM, rvos wrote: > A brief word on the topic of perltidy: no. I like what it does, and > I sort of follow one of its settings (-syn -sob -b), but if you run > it on a whole source tree it'll screw up the diffs, and I'm still > worried about it breaking things (though really it shouldn't, it > creates a *.bak if something doesn't compile anymore). > > Rutger > > > > -----Original Message----- > >> Date: Sat Jun 16 10:09:18 PDT 2007 >> From: "rvos" >> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy >> To: "Hilmar Lapp" , "Sean Davis" >> >> >> CIPRES and Bio::Phylo use svn. As for the benefits, a lot of sales >> talk has been expended over it already, for my own purpose I like >> the integration with eclipse (through subclipse plugin) and >> komodo, in addition to the atomic commits (so I can ctrl+c if I >> goof up (again)). >> >> For standalone use on osx I didn't use the fink one, but I forgot >> where I did get it from. It was very easy to set up, though. On >> windows there is a really nice standalone one (tortoisesvn) that >> integrates with the explorer so you can see on the file icons what >> the state of a file is. I know that there's a cvs2svn utility that >> converts your revision history (seems a requirement). >> >> Rutger >> >> >> -----Original Message----- >> >>> Date: Sat Jun 16 07:55:09 PDT 2007 >>> From: "Hilmar Lapp" >>> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy >>> To: "Sean Davis" >>> >>> >>> On Jun 16, 2007, at 7:21 AM, Sean Davis wrote: >>> >>>> As for access, the typical access is over http (or https). 
>>> >>> We're using svn+ssh here (NESCent) so the password is the same as >>> the >>> one you set for your account on the server, and you can use public/ >>> private key negotiation for authentication. >>> >>> I think the ability to not provide a password for every single >>> interaction is a requirement. If that requires using svn+ssh or can >>> be made to work through https too I don't know. On sf.net I have to >>> use https for svn and it doesn't ask me for the password each time. >>> Not sure how this works though, maybe some local caching? >>> >>> We should not be using http, or whatever other protocol that sends >>> unencrypted passwords. >>> >>>> Access controls can be set up on the server side while allowing >>>> anonymous access for checkout. There are many excellent SVN for >>>> every OS, so that should not be a problem. >>> >>> On Mac OSX the most convenient way I have found is through fink. It >>> does ask to install 30 other dependencies, which had me balk at >>> first, but me doing it by hand is even worse than fink doing it, >>> so I >>> finally gave in and it's really a breeze. I've not had a single >>> issue. >>> >>> From a sysadmin perspective, what might be worth keeping in >>> mind is >>> that svn is going to store everything in a database (BerkeleyDB I >>> think). I.e., there is no such thing anymore as restoring individual >>> source code files from backup if one gets accidentally corrupted on >>> the server. It seems you have to restore the entire database, i.e., >>> the entire repository. I vaguely recall though that how svn manages >>> the repository is actually configurable and that other storage than >>> DB is possible too. Don't ask me for the pros and cons of one vs the >>> other. 
>>> >>> -hilmar >>> -- >>> =========================================================== >>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >>> =========================================================== >>> >>> >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hartzell at alerce.com Sat Jun 16 13:47:01 2007 From: hartzell at alerce.com (George Hartzell) Date: Sat, 16 Jun 2007 10:47:01 -0700 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <18035.14352.963113.473274@almost.alerce.com> <6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu> <6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net> Message-ID: <18036.8725.29073.619527@almost.alerce.com> Chris Fields writes: > Ah, got it. Sorry. > > George, planning on taking this up? I'm going to take a *peek*. I just finished (unless someone finds another issue) moving someone's cvs repository over to svn, so I have some tools cobbled together and some knowledge in the cache. I don't have too much idle time at the moment though, so if it gets gooey I'll just summarize what I learn. Either way it seems worth a peek. I will need the repository itself though. I'll post a note to support at open-bio.org. g. 
From jason at bioperl.org Sat Jun 16 19:54:18 2007 From: jason at bioperl.org (Jason Stajich) Date: Sat, 16 Jun 2007 16:54:18 -0700 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <18036.8725.29073.619527@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <18035.14352.963113.473274@almost.alerce.com> <6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu> <6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net> <18036.8725.29073.619527@almost.alerce.com> Message-ID: <6F57475B-715F-49D1-B6D2-F3FD3ACCB728@bioperl.org> Thanks George. I'll respond to your support ticket as well but I put up tarballs of the repository as of today. I had thought at one point ChrisD might have set up rsync-able access to the whole repository through code.open-bio.org but for now I have put up tarballs of most of the CVS dirs from bioperl: http://bioperl.org/uploads/ Just to say I already went through all the steps of running cvs2svn myself and had problems gathering back out the branches and all the tags when I tried it. You may want to start with a smaller repository like bioperl-network or bioperl-db, as the initial cvs2svn conversion script took quite a long time to run on bioperl-live. Regarding ssh/https: We have already gone through some of this for the blipkit and biojava projects. I think we'll still keep separate anonymous read-only (code.open-bio.org) and writeable repositories (dev.open-bio.org) as I think we are resisting any webapps on the development server as we want that to be as locked down as possible. For the newly created svn repositories that I've been creating/using I just use svn+ssh and that worked okay. -jason On Jun 16, 2007, at 10:47 AM, George Hartzell wrote: > Chris Fields writes: >> Ah, got it. Sorry. >> >> George, planning on taking this up? > > I'm going to take a *peek*.
I just finished (unless someone finds > another issue) moving someone's cvs repository over to svn, so I have > some tools cobbled together and some knowledge in the cache. > > I don't have too much idle time at the moment though, so if it gets > gooey I'll just summarize what I learn. Either way it seems worth a > peek. > > I will need the repository itself though. I'll post a note to > support at open-bio.org. > > g. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From hartzell at alerce.com Sat Jun 16 19:56:09 2007 From: hartzell at alerce.com (George Hartzell) Date: Sat, 16 Jun 2007 16:56:09 -0700 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <46739D69.4090204@sheffield.ac.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <18035.14352.963113.473274@almost.alerce.com> <46739D69.4090204@sheffield.ac.uk> Message-ID: <18036.30873.609341.181853@almost.alerce.com> Nathan S. Haigh writes: > [...] > Sounds like George might know what he's doing! Hey, I've been looking for a Marketing Director. Want a job? > I have a question about > setting up svn access. I believe access can be done in several ways, > over webdav, over ssh and probably others too. Do you have any knowledge > about the benefits of one over the other? I suppose I'm thinking of what > to implement to allow anonymous read access for users and authenticated > access for developers. There are two and a half ways to talk to the repository: - You can put it behind a web server (e.g. apache) and get at it using http/https. Authentication and authorization happen using the normal web server tricks, so as long as you don't do anything silly (e.g. don't use basic auth, stick with mod_auth_digest), even http connections won't send passwords in the clear. 
You can define users in .htpasswd files or use any of the fancier setups (e.g. sql databases, etc...).

- You can talk to it via subversion's simple server, svnserve. There are two ways you usually talk to svnserve (neither of which sends passwords in the clear):

  * directly, using a URL like

      svn://svn.example.com/repo/proj/trunk

    when you do this the client either talks directly to a copy of svnserve running as a daemon, or possibly to something like inetd that'll start an svnserve as necessary. In this case, you define authen. and author. info in an svnserve.conf file.

  * indirectly, using a URL like

      svn+ssh://svn.example.com/repo/proj/trunk/

    in which case you make an ssh connection to the server machine (and authenticate via ssh mechanisms, anything other than a key-pair will drive you nuts with repeated password requests) and then an svnserve process is started up for you in "tunnel mode". Access control is coarse grained and via OS level access permissions. Generally in this case you need to give out shell accounts to everyone involved, or (tsk, tsk) have them use a common account. There's a cute trick in the svn book that shows how to use a shared ssh account but still have all of the changes in the repo keep track of the real user. I've never tried it....

- If you're on the same machine as the repo, you can simply use:

    file:///path/to/repo/proj/trunk

The biggest deciding factor is how you want to manage your users and whether you're already messing around with a web server. I've generally worked in small groups and everyone's had ssh access, but I've set it up the other ways too. You can even access via multiple paths.
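The routes listed above differ only in the URL scheme the client is given, so they can be summarized as a small lookup. This is just an illustrative sketch: the descriptions paraphrase the list above, the svn.example.com URLs are placeholders, and describe_route and %route_for are made-up names, not part of subversion or BioPerl.

```perl
use strict;
use warnings;

# Access routes keyed by URL scheme, paraphrasing the list above.
my %route_for = (
    'http'    => 'web server (e.g. apache); auth handled by the web server',
    'https'   => 'web server (e.g. apache); auth handled by the web server',
    'svn'     => 'svnserve daemon (or inetd); auth configured in svnserve.conf',
    'svn+ssh' => 'svnserve in tunnel mode over ssh; auth via ssh, OS level permissions',
    'file'    => 'direct access on the same machine; OS level permissions',
);

# Pull the scheme off a repository URL and describe the route it implies.
sub describe_route {
    my ($url) = @_;
    # A URL needs "scheme://"; a single slash (svn:/host/...) is rejected.
    my ($scheme) = $url =~ m{^([a-z+]+)://}i
        or return "not a repository URL: $url";
    return $route_for{ lc $scheme } || "unknown scheme: $scheme";
}

for my $url (
    'https://svn.example.com/repo/proj/trunk',
    'svn://svn.example.com/repo/proj/trunk',
    'svn+ssh://svn.example.com/repo/proj/trunk/',
    'file:///path/to/repo/proj/trunk',
) {
    printf "%-45s %s\n", $url, describe_route($url);
}
```

Since the same repository can be reachable by more than one route, a table like this is mostly useful for documenting which URL to hand to anonymous users and which to developers.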
The only trick is that the repository needs to be writable by whoever's committing, and if they're running svnserve themselves (file: or svn+ssh:) and things aren't set up right (all the dirs in the repo need to be group writable and have the magic (setgid) bit set so that any new stuff created is also writable, users' umasks and group membership need to be aligned) then things go fubar. Google's your friend here, and each of the OSes/distros has a standard hack for making this work, usually involving a wrapper app that takes care of things. Feel free to ask any particular questions. Phew, g. From jason at bioperl.org Sat Jun 16 20:17:58 2007 From: jason at bioperl.org (Jason Stajich) Date: Sat, 16 Jun 2007 17:17:58 -0700 Subject: [Bioperl-l] seq doesn't validate error In-Reply-To: <200706151653.04135.sheris@eps.berkeley.edu> References: <200706151558.12911.sheris@eps.berkeley.edu> <1A4207F8295607498283FE9E93B775B4034C6E7B@EX02.asurite.ad.asu.edu> <200706151653.04135.sheris@eps.berkeley.edu> Message-ID: <6A369DE9-943A-4DF1-9DF0-F68E361C8C20@bioperl.org> The error is clearly saying there must be a symbol or letter in your sequence that violates the regexp. I had modified the code in CVS to actually provide a more informative mismatch error in the error message, but this is probably not in the release you are using. Anyways, add this to see what is causing the problem: print join(",",($nstarthash{$_}[1] =~ /([^$Bio::PrimarySeq::MATCHPATTERN]+)/g)), "\n"; -jason On Jun 15, 2007, at 4:53 PM, Sheri Simmons wrote: > Thanks for the suggestion, but that still gives the same error as > before. > > On Friday 15 June 2007 4:11 pm, Kevin Brown wrote: >>> I'm getting an error as follows when I try to reverse >>> complement a sequence string stored in a hash of arrays. The >>> storage code is: >>> >>> $nstarthash{$key} = [$sortchecks[0], join("", >>> @nseq), >>> join("",@{$seqhash{$key}})]; >>> >>> the sequence of interest is the element at index 1.
>>> >>> Later, I try to retrieve this string for a subset of keys so >>> I can reverse complement it based on input from another hash >>> (%complement): >>> >>> my %revcomphash = map { my $read = $_; >>> grep $complement{$read} eq 'C', %complement; >>> {$_, (Bio::Seq->new(-seq >>> =>$nstarthash{$_}[1]))->revcom->seq()};} >>> keys(%nstarthash); >>> >>> >>> I get the following warning (long sequence edited for clarity): >>> >>> -- -------------------- WARNING --------------------- >>> MSG: seq doesn't validate, mismatch is 1 >>> --------------------------------------------------- >>> >>> ------------- EXCEPTION ------------- >>> MSG: Attempting to set the sequence to >>> [GCCCCTGTAATCGCTTTTATATCGTCAGCGATC] >>> which does not look healthy >>> STACK Bio::PrimarySeq::seq /usr/share/perl5/Bio/PrimarySeq.pm:268 >>> STACK Bio::PrimarySeq::new /usr/share/perl5/Bio/PrimarySeq.pm:217 >>> STACK Bio::Seq::new /usr/share/perl5/Bio/Seq.pm:498 STACK >>> toplevel ../quality_wrapper.pl:103 >>> >>> I cannot find any non-allowed characters in the sequence, and >>> the de-referencing appears to work correctly. Can anyone help me? >>> I'm using the latest Bioperl installation (1.5.2) with >>> ActivePerl5.8 on a Mepis 6.5 system. >> >> Try telling the Bio::Seq object what alphabet to use when creating >> it. >> I tend to create them like: >> >> Bio::Seq->new(-seq=> $seqvar, -alphabet=>'dna') > > -- > Sheri Simmons > Department of Earth and Planetary Sciences > University of California, Berkeley > Berkeley, CA 94720-4767 > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From n.haigh at sheffield.ac.uk Sun Jun 17 07:45:11 2007 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Sun, 17 Jun 2007 12:45:11 +0100 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <10517356.1182013758160.JavaMail.myubc2@brahms.my.ubc.ca> References: <10517356.1182013758160.JavaMail.myubc2@brahms.my.ubc.ca> Message-ID: <46751EC7.8020609@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 rvos wrote: > CIPRES and Bio::Phylo use svn. As for the benefits, a lot of sales talk has been expended over it already, for my own purpose I like the integration with eclipse (through subclipse plugin) and komodo, in addition to the atomic commits (so I can ctrl+c if I goof up (again)). > > For standalone use on osx I didn't use the fink one, but I forgot where I did get it from. It was very easy to set up, though. On windows there is a really nice standalone one (tortoisesvn) that integrates with the explorer so you can see on the file icons what the state of a file is. I know that there's a cvs2svn utility that converts your revision history (seems a requirement). > > Rutger > > Just to clarify, subversion is available as command line for windows: http://subversion.tigris.org/project_packages.html TortoiseSVN is another svn client with a GUI that integrates into the shell. I tried setting this up a while back to use ssh (via PUTTY), but I wasn't successful. This may have been due to me just starting out with svn or that it was harder to setup in an earlier version of TortoiseSVN. Does anyone have experience of setting up svn on Windows to use ssh? If the changeover takes place, I'm happy to write some howto's for setting up svn clients for Windows. 
Nath -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGdR7HczuW2jkwy2gRAmgOAJ96wLzVYbjqEPborZTsw6gwU6UitgCfV02v 8xHJvn/Eqf9LePR3Ei0ZaIw= =t5pN -----END PGP SIGNATURE----- From george.heller at yahoo.com Sun Jun 17 14:41:55 2007 From: george.heller at yahoo.com (George Heller) Date: Sun, 17 Jun 2007 11:41:55 -0700 (PDT) Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <46742A32.90305@sendu.me.uk> Message-ID: <148654.15952.qm@web56511.mail.re3.yahoo.com> Hi all, Can anyone point me to some example that uses the get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at this, and I am not quite sure how to implement it. Thanks. George Sendu Bala wrote: George Heller wrote: > Hi all, > > I am looking at extracting the taxonomy hierarchy for some taxon ids. > What I plan to do is, for a given taxon id, say 33090, I want to > extract all taxon ids that are children of this species. I do not > just want the immediate children, but the children's children and so > on. > > Any ideas on the way I can go about doing this? Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in some kind of looping structure. Most easily a recursing sub. If you happen to code up something neat and efficient, why not share it with us and we could add it to the Taxonomy module(s). --------------------------------- Shape Yahoo! in your own image. Join our Network Research Panel today! From jason at bioperl.org Sun Jun 17 16:48:05 2007 From: jason at bioperl.org (Jason Stajich) Date: Sun, 17 Jun 2007 13:48:05 -0700 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <148654.15952.qm@web56511.mail.re3.yahoo.com> References: <148654.15952.qm@web56511.mail.re3.yahoo.com> Message-ID: <617EAA37-D502-42AE-8820-C9514DBF7ACD@bioperl.org> I assume you already figured out how to setup a local taxonomydb? 
You just want the extant species/leaves of the tree:

my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

-jason

On Jun 17, 2007, at 11:41 AM, George Heller wrote:

> Hi all,
>
> Can anyone point me to some example that uses the
> get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at
> this, and I am not quite sure how to implement it.
>
> Thanks.
> George
>
> Sendu Bala wrote:
> George Heller wrote:
>> Hi all,
>>
>> I am looking at extracting the taxonomy hierarchy for some taxon ids.
>> What I plan to do is, for a given taxon id, say 33090, I want to
>> extract all taxon ids that are children of this species. I do not
>> just want the immediate children, but the children's children and so
>> on.
>>
>> Any ideas on the way I can go about doing this?
>
> Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
> some kind of looping structure. Most easily a recursing sub.
>
> If you happen to code up something neat and efficient, why not
> share it
> with us and we could add it to the Taxonomy module(s).
>
>
>
> ---------------------------------
> Shape Yahoo! in your own image. Join our Network Research Panel
> today!
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/

From aaron.j.mackey at gsk.com Sun Jun 17 22:35:42 2007
From: aaron.j.mackey at gsk.com (aaron.j.mackey at gsk.com)
Date: Sun, 17 Jun 2007 22:35:42 -0400
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <46742A32.90305@sendu.me.uk>
Message-ID:

To do so efficiently, you might want to check out:

http://www.oreillynet.com/pub/a/network/2002/11/27/bioconf.html

-Aaron

bioperl-l-bounces at lists.open-bio.org wrote on 06/16/2007 02:21:38 PM:

> George Heller wrote:
> > Hi all,
> >
> > I am looking at extracting the taxonomy hierarchy for some taxon ids.
> > What I plan to do is, for a given taxon id, say 33090, I want to > > extract all taxon ids that are children of this species. I do not > > just want the immediate children, but the children's children and so > > on. > > > > Any ideas on the way I can go about doing this? > > Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in > some kind of looping structure. Most easily a recursing sub. > > If you happen to code up something neat and efficient, why not share it > with us and we could add it to the Taxonomy module(s). > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From aaron.j.mackey at gsk.com Sun Jun 17 22:34:12 2007 From: aaron.j.mackey at gsk.com (aaron.j.mackey at gsk.com) Date: Sun, 17 Jun 2007 22:34:12 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: Message-ID: > On Jun 16, 2007, at 7:21 AM, Sean Davis wrote: > > > As for access, the typical access is over http (or https). > > We're using svn+ssh here (NESCent) Let me just note that https is preferable to ssh for those poor slobs stuck behind a corporate firewall (svn happily prompts me for my proxy server's user/pass, then my https authentication realm's user/pass - all then get cached in some .svn/ file that I don't have to worry about again until my proxy server password changes once a month ...) -Aaron From george.heller at yahoo.com Mon Jun 18 00:21:45 2007 From: george.heller at yahoo.com (George Heller) Date: Sun, 17 Jun 2007 21:21:45 -0700 (PDT) Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <617EAA37-D502-42AE-8820-C9514DBF7ACD@bioperl.org> Message-ID: <487845.37410.qm@web56510.mail.re3.yahoo.com> Thanks. And how can I assign the $node here in the below code, such that I can reference it to a particular taxon id record? I want to retrieve all the descendents from the taxonomy hierarchy, given a particular taxon id. 
I have a local db setup, in which I have uploaded data using the load_ncbi_taxonomy.pl script. Thanks. George Jason Stajich wrote: I assume you already figured out how to setup a local taxonomydb? You just want the extant species/leaves of the tree my @extant_children = grep { $_->is_Leaf } $node->get_all_Descedents; -jason On Jun 17, 2007, at 11:41 AM, George Heller wrote: Hi all, Can anyone point me to some example that uses the get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at this, and I am not quite sure how to implement it. Thanks. George Sendu Bala wrote: George Heller wrote: Hi all, I am looking at extracting the taxonomy hierarchy for some taxon ids. What I plan to do is, for a given taxon id, say 33090, I want to extract all taxon ids that are children of this species. I do not just want the immediate children, but the children's children and so on. Any ideas on the way I can go about doing this? Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in some kind of looping structure. Most easily a recursing sub. If you happen to code up something neat and efficient, why not share it with us and we could add it to the Taxonomy module(s). --------------------------------- Shape Yahoo! in your own image. Join our Network Research Panel today! _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ --------------------------------- Need a vacation? Get great deals to amazing places on Yahoo! Travel. From bix at sendu.me.uk Mon Jun 18 06:44:00 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 18 Jun 2007 11:44:00 +0100 Subject: [Bioperl-l] Network tests overhaul Message-ID: <467661F0.2060703@sendu.me.uk> When the test suite runs currently, most (the intent is all) tests skip if the test would require network (internet) access. 
This is to avoid tests failing not due to bugs in Bioperl code, but due to temporarily inaccessible servers. This is also to make running the test suite faster. To do a complete test you currently have to set BIOPERLDEBUG to true, which activates the network test but also increases verbosity. This actually causes a problem, since when running the entire test suite the additional debug information is more a hindrance than a help, since the reams of printed information can hide significant warnings that may also get printed. Its also ugly. The solution is to divorce activation of network tests from the request for verbosity. The obvious implementation is to have another environment variable, perhaps BIOPERLNETWORK. However, there is an opportunity to do something more appropriate. The running of networking tests should be a choice given to every end-user installing Bioperl. Debugging information, on the other hand, is only of interest to the developer working on a specific module under test, so can be left as a 'hidden' env var. I have just committed one possible implementation along these lines. You say: perl Build.PL as normal, and if you seem to have internet access it asks you if you'd like to run network tests. The default answer is no. If you answer yes, network tests will be enabled. You can alternatively say: perl Build.PL --network and if you seem to have internet access, network tests will be enabled. Then you run the tests: ./Build test Any tests written to support the new system will then skip network tests if they haven't been enabled. The only test I've written to support the new system is t/RemoteBlast.t: ./Build test --test_files t/RemoteBlast.t --verbose Adding support to test scripts consists of the following changes: + use Module::Build; + my $build = Module::Build->current(get_options => { network => {} }); + my $do_network_tests = $build->notes('network'); ! if (!$ENV{'BIOPERLDEBUG'}) { # skip network tests --- ! 
if (!$do_network_tests) { # skip network tests I propose adding this support to all test scripts that carry out network tests. Does anyone have objections? Does anyone have alternate implementations that may be superior? I specifically suggest we don't use an env var in addition to the above, because the multiple ways of doing things could lead to confusion. Which takes priority? Did a user really have the networking tests turned on when he reported his test results? The one thing I need help with is identifying which tests attempt to access the internet. I think we caught most of them for the 1.5.2 release, but I think there are more lurking around. Can anyone offer a way to systematically find at least the test scripts which access the internet, if not the specific tests within? Cheers, Sendu. From bix at sendu.me.uk Mon Jun 18 06:46:17 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 18 Jun 2007 11:46:17 +0100 Subject: [Bioperl-l] Network tests overhaul In-Reply-To: <467661F0.2060703@sendu.me.uk> References: <467661F0.2060703@sendu.me.uk> Message-ID: <46766279.7050202@sendu.me.uk> Sendu Bala wrote: > Adding support to test scripts consists of the following changes: > > + use Module::Build; > + my $build = Module::Build->current(get_options => { network => {} }); That should read: + my $build = Module::Build->current(); > + my $do_network_tests = $build->notes('network'); From cjfields at uiuc.edu Mon Jun 18 07:45:10 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 18 Jun 2007 06:45:10 -0500 Subject: [Bioperl-l] Network tests overhaul In-Reply-To: <46766279.7050202@sendu.me.uk> References: <467661F0.2060703@sendu.me.uk> <46766279.7050202@sendu.me.uk> Message-ID: The idea sounds good, though if we plan on doing this we need to update the Test HOWTO as well. Some modules require only a few (<50% of the total) network tests; I think SeqFeature.t may be one, though I'm not sure. Does this handle those cases? 
chris

On Jun 18, 2007, at 5:46 AM, Sendu Bala wrote:

> Sendu Bala wrote:
>> Adding support to test scripts consists of the following changes:
>>
>> + use Module::Build;
>> + my $build = Module::Build->current(get_options => { network =>
>> {} });
>
> That should read:
> + my $build = Module::Build->current();
>
>> + my $do_network_tests = $build->notes('network');
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From bix at sendu.me.uk Mon Jun 18 07:49:18 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 18 Jun 2007 12:49:18 +0100
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To:
References: <467661F0.2060703@sendu.me.uk> <46766279.7050202@sendu.me.uk>
Message-ID: <4676713E.1000508@sendu.me.uk>

Chris Fields wrote:
> The idea sounds good, though if we plan on doing this we need to update
> the Test HOWTO as well.
>
> Some modules require only a few (<50% of the total) network tests; I
> think SeqFeature.t may be one, though I'm not sure. Does this handle
> those cases?

Yes, the system just gives the test script a boolean indicating whether network tests should be run. The script can then do whatever it wants with the boolean: skip all tests, skip no tests, skip just some tests... It's a drop-in replacement for the current 'debug' boolean, which is based on BIOPERLDEBUG.
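
To illustrate the "skip just some tests" case, here is a minimal sketch of what a test script could look like. It assumes the script uses Test::More and its SKIP block (the thread does not say which test module the scripts use), and the helper name network_tests_enabled is made up for this example. The eval fallback to "off" when no Build.PL configuration can be found is also an assumption of the sketch, not something described in the proposal.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical helper: ask Module::Build whether network tests were
# enabled when 'perl Build.PL' was run. The eval means that if no
# build configuration can be loaded, we simply treat network tests
# as disabled (an assumption of this sketch).
sub network_tests_enabled {
    my $enabled = eval {
        require Module::Build;
        my $build = Module::Build->current();
        $build->notes('network');
    };
    return $enabled ? 1 : 0;
}

my $do_network_tests = network_tests_enabled();

# Tests that need no network access always run.
is(uc('acgt'), 'ACGT', 'offline test always runs');

SKIP: {
    # Skip only the network-dependent tests, and count them so the
    # test plan still adds up.
    skip 'network tests have not been requested', 2
        unless $do_network_tests;

    # Tests that contact a remote server would go here; these two
    # are placeholders.
    ok(1, 'placeholder for first network test');
    ok(1, 'placeholder for second network test');
}
```

A script where every test needs the network could instead put all of its tests inside the SKIP block.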
From hlapp at gmx.net Mon Jun 18 08:38:25 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 08:38:25 -0400
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <487845.37410.qm@web56510.mail.re3.yahoo.com>
References: <487845.37410.qm@web56510.mail.re3.yahoo.com>
Message-ID: <12C87737-B6D5-46E5-BCF4-5388A949009B@gmx.net>

I'm a bit confused - it sounds like you have set up a local BioSQL database and loaded the NCBI taxonomy into the database. You can now use simple SQL to retrieve all descendants of a node in the tree given its NCBI taxonID, such as

  SELECT tn.*, tnm.name
  FROM taxon tn, taxon_name tnm, taxon n
  WHERE n.ncbi_taxon_id = :taxonID
  AND tn.left_value > n.left_value
  AND tn.right_value < n.right_value
  AND tn.taxon_id = tnm.taxon_id
  AND tnm.name_class = 'scientific name'

BioPerl doesn't have a Taxonomy::biosql module yet (though this would seem like a worthwhile thing to add), so you can't use the Bio::DB::Taxonomy interface to do this against a BioSQL instance.

However, BioPerl does have support for the flat-file download of the NCBI taxonomy database and indexes it, so you can simply use Taxonomy::{get_taxon,get_all_Descendents} with the flatfile download to achieve what you wanted to do in less than 5 lines of perl. Although the recursive implementation of Taxonomy::get_all_Descendents() won't be lightning fast, it may still be perfectly fine for your application - are you sure it is not?

-hilmar

On Jun 18, 2007, at 12:21 AM, George Heller wrote:

> Thanks. And how can I assign the $node here in the below code, such
> that I can reference it to a particular taxon id record? I want to
> retrieve all the descendents from the taxonomy hierarchy, given a
> particular taxon id.
>
> I have a local db setup, in which I have uploaded data using the
> load_ncbi_taxonomy.pl script.
>
> Thanks.
> George
>
> Jason Stajich wrote:
> I assume you already figured out how to setup a local taxonomydb?
> > > You just want the extant species/leaves of the tree > > > my @extant_children = grep { $_->is_Leaf } $node->get_all_Descedents; > > > > -jason > On Jun 17, 2007, at 11:41 AM, George Heller wrote: > > Hi all, > > > Can anyone point me to some example that uses the > get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at > this, and I am not quite sure how to implement it. > > > Thanks. > George > > > Sendu Bala wrote: > George Heller wrote: > Hi all, > > > I am looking at extracting the taxonomy hierarchy for some taxon > ids. > What I plan to do is, for a given taxon id, say 33090, I want to > extract all taxon ids that are children of this species. I do not > just want the immediate children, but the children's children and so > on. > > > Any ideas on the way I can go about doing this? > > > Well, you'll use Bio::DB::Taxonomy presumably, and > each_Descendent in > some kind of looping structure. Most easily a recursing sub. > > > If you happen to code up something neat and efficient, why not > share it > with us and we could add it to the Taxonomy module(s). > > > > > > > --------------------------------- > Shape Yahoo! in your own image. Join our Network Research Panel > today! > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > -- > Jason Stajich > jason at bioperl.org > http://jason.open-bio.org/ > > > > > > > > --------------------------------- > Need a vacation? Get great deals to amazing places on Yahoo! Travel. 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Mon Jun 18 08:44:22 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 18 Jun 2007 08:44:22 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: Message-ID: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> Just curious - how do you cvs commit then to an external repository? Is that open in the firewall? It is true though that corporations typically will not permit any encrypted outgoing traffic through their firewall except https. sf.net only supports https for svn, AFAIK. -hilmar On Jun 17, 2007, at 10:34 PM, aaron.j.mackey at gsk.com wrote: >> On Jun 16, 2007, at 7:21 AM, Sean Davis wrote: >> >>> As for access, the typical access is over http (or https). >> >> We're using svn+ssh here (NESCent) > > Let me just note that https is preferable to ssh for those poor slobs > stuck behind a corporate firewall (svn happily prompts me for my proxy > server's user/pass, then my https authentication realm's user/pass > - all > then get cached in some .svn/ file that I don't have to worry about > again > until my proxy server password changes once a month ...) 
> > -Aaron > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Mon Jun 18 08:47:56 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 18 Jun 2007 08:47:56 -0400 Subject: [Bioperl-l] Network tests overhaul In-Reply-To: <467661F0.2060703@sendu.me.uk> References: <467661F0.2060703@sendu.me.uk> Message-ID: Sounds like a great idea to me. -hilmar On Jun 18, 2007, at 6:44 AM, Sendu Bala wrote: > When the test suite runs currently, most (the intent is all) tests > skip > if the test would require network (internet) access. This is to avoid > tests failing not due to bugs in Bioperl code, but due to temporarily > inaccessible servers. This is also to make running the test suite > faster. > > To do a complete test you currently have to set BIOPERLDEBUG to true, > which activates the network test but also increases verbosity. This > actually causes a problem, since when running the entire test suite > the > additional debug information is more a hindrance than a help, since > the > reams of printed information can hide significant warnings that may > also > get printed. Its also ugly. > > The solution is to divorce activation of network tests from the > request > for verbosity. The obvious implementation is to have another > environment > variable, perhaps BIOPERLNETWORK. However, there is an opportunity > to do > something more appropriate. The running of networking tests should > be a > choice given to every end-user installing Bioperl. Debugging > information, on the other hand, is only of interest to the developer > working on a specific module under test, so can be left as a 'hidden' > env var. 
> > > I have just committed one possible implementation along these lines. > > You say: > perl Build.PL > as normal, and if you seem to have internet access it asks you if > you'd > like to run network tests. The default answer is no. If you answer > yes, > network tests will be enabled. > > You can alternatively say: > perl Build.PL --network > and if you seem to have internet access, network tests will be > enabled. > > Then you run the tests: > ./Build test > Any tests written to support the new system will then skip network > tests > if they haven't been enabled. > > The only test I've written to support the new system is t/ > RemoteBlast.t: > ./Build test --test_files t/RemoteBlast.t --verbose > > > Adding support to test scripts consists of the following changes: > > + use Module::Build; > + my $build = Module::Build->current(get_options => { network => > {} }); > + my $do_network_tests = $build->notes('network'); > > ! if (!$ENV{'BIOPERLDEBUG'}) { # skip network tests > --- > ! if (!$do_network_tests) { # skip network tests > > > I propose adding this support to all test scripts that carry out > network > tests. Does anyone have objections? Does anyone have alternate > implementations that may be superior? > > I specifically suggest we don't use an env var in addition to the > above, > because the multiple ways of doing things could lead to confusion. > Which > takes priority? Did a user really have the networking tests turned on > when he reported his test results? > > > The one thing I need help with is identifying which tests attempt to > access the internet. I think we caught most of them for the 1.5.2 > release, but I think there are more lurking around. Can anyone offer a > way to systematically find at least the test scripts which access the > internet, if not the specific tests within? > > Cheers, > Sendu. 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Mon Jun 18 08:55:53 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 18 Jun 2007 07:55:53 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> References: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> Message-ID: <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu> On Jun 18, 2007, at 7:44 AM, Hilmar Lapp wrote: > Just curious - how do you cvs commit then to an external repository? > Is that open in the firewall? > > It is true though that corporations typically will not permit any > encrypted outgoing traffic through their firewall except https. > sf.net only supports https for svn, AFAIK. > > -hilmar If so it may be better to allow https, though I don't know how Chris D. and others feel about it. Did we make a decision as to the fate of cvs if we get svn up-and- running? Keep it around (assuming svn commits would be carried over to cvs and vice versa)? Or see what happens over time? chris From sdavis2 at mail.nih.gov Mon Jun 18 09:05:50 2007 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Mon, 18 Jun 2007 09:05:50 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: Message-ID: <4676832E.5080704@mail.nih.gov> aaron.j.mackey at gsk.com wrote: >> On Jun 16, 2007, at 7:21 AM, Sean Davis wrote: >> >>> As for access, the typical access is over http (or https). 
>> We're using svn+ssh here (NESCent) > > Let me just note that https is preferable to ssh for those poor slobs > stuck behind a corporate firewall (svn happily prompts me for my proxy > server's user/pass, then my https authentication realm's user/pass - all > then get cached in some .svn/ file that I don't have to worry about again > until my proxy server password changes once a month ...) That would be my suggestion as well (although I added it only parenthetically). Sean From hlapp at gmx.net Mon Jun 18 09:13:27 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 18 Jun 2007 09:13:27 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu> References: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu> Message-ID: <6759E154-CB02-4D76-8AB8-26C19FB952D2@gmx.net> On Jun 18, 2007, at 8:55 AM, Chris Fields wrote: > Did we make a decision as to the fate of cvs if we get svn up-and- > running? Keep it around (assuming svn commits would be carried > over to cvs and vice versa)? Or see what happens over time? Let's not plan for having cvs and svn writable repositories in parallel - that would create an administrative nightmare. Once the tests complete, there'll be a clean cut-over. What Jason suggested is to try and continue a read-only (anonymous) cvs repository, updated from the svn repository that the developers use, aside from an anonymous svn repository mirroring the writable one. This would primarily be for maintaining working URLs for those folks who http-linked into the anonymous cvs repository. What I added earlier is that even if that fails to be feasible, you can achieve the goal using some small CGI script and apache redirect to map CVS- style links to the anonymous svn repository. 
-hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Mon Jun 18 09:31:35 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 18 Jun 2007 08:31:35 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <6759E154-CB02-4D76-8AB8-26C19FB952D2@gmx.net> References: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu> <6759E154-CB02-4D76-8AB8-26C19FB952D2@gmx.net> Message-ID: <0E64DBD0-BBE9-411A-A146-70236EF558BB@uiuc.edu> On Jun 18, 2007, at 8:13 AM, Hilmar Lapp wrote: > > On Jun 18, 2007, at 8:55 AM, Chris Fields wrote: > >> Did we make a decision as to the fate of cvs if we get svn up-and- >> running? Keep it around (assuming svn commits would be carried >> over to cvs and vice versa)? Or see what happens over time? > > Let's not plan for having cvs and svn writable repositories in > parallel - that would create an administrative nightmare. Once the > tests complete, there'll be a clean cut-over. My thoughts as well. Much simpler. > What Jason suggested is to try and continue a read-only (anonymous) > cvs repository, updated from the svn repository that the developers > use, aside from an anonymous svn repository mirroring the writable > one. This would primarily be for maintaining working URLs for those > folks who http-linked into the anonymous cvs repository. What I > added earlier is that even if that fails to be feasible, you can > achieve the goal using some small CGI script and apache redirect to > map CVS-style links to the anonymous svn repository. > > -hilmar I like the idea of a read-only cvs or a 'faux' cvs, though the former would initially be easier as we already have it available. We could just lock it down at some switchover point to read-only (something I think Jason also suggested). 
chris From bix at sendu.me.uk Mon Jun 18 09:13:33 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 18 Jun 2007 14:13:33 +0100 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> <3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu> Message-ID: <467684FD.3080300@sendu.me.uk> Chris Fields wrote: > > On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote: >> If its going to be difficult and a hassle, for such an unnecessary >> thing I'm not sure its worth it. There are more pressing things to be >> done for Bioperl. >> >> If I can just run perltidy on the entire package and commit, I'd do >> it. If that's not appropriate, I won't. > > The choices aren't necessarily all or nothing. What about voluntary, > recommended use of a perltidy config file included with the > distribution, with additional 'caveats'? I'm happy with that idea. Why not come up with something and make it available for us to try out? Cheers, Sendu. From bix at sendu.me.uk Mon Jun 18 09:26:36 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 18 Jun 2007 14:26:36 +0100 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu> References: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu> Message-ID: <4676880C.9030009@sendu.me.uk> Chris Fields wrote: > If so it may be better to allow https, though I don't know how Chris > D. and others feel about it. If it makes no difference to me as an end-user, I won't mind. But I won't want to enter my password even once, at the beginning of a session. If that's not possible with https, then ssh should be an option as well. Unrelated, but it randomly just occurred to me: what happens to all the id lines at the top of modules? 
Eg: $Id: bl2seq.pm,v 1.28 2007/06/14 14:16:10 sendu Exp $ That's a cvs-specific thing, right? Do we delete them all? (Regardless, I wish we would, since they caused me no end of hassles during the 1.5.2 release, doing updates across branches.) > Did we make a decision as to the fate of cvs if we get svn up-and- > running? Keep it around (assuming svn commits would be carried over > to cvs and vice versa)? Or see what happens over time? Well, I don't think hard decisions are possible until we know how its going to work in practice. I tried setting up my own svn repository once, but didn't keep it and can't remember much about it. So, I suppose we'll play it by ear and decide things later. Is someone out there actively doing something leading toward a demonstration of how it will be? From cjfields at uiuc.edu Mon Jun 18 09:58:34 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 18 Jun 2007 08:58:34 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <467684FD.3080300@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> <3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu> <467684FD.3080300@sendu.me.uk> Message-ID: On Jun 18, 2007, at 8:13 AM, Sendu Bala wrote: > Chris Fields wrote: >> >> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote: >>> If its going to be difficult and a hassle, for such an unnecessary >>> thing I'm not sure its worth it. There are more pressing things >>> to be >>> done for Bioperl. >>> >>> If I can just run perltidy on the entire package and commit, I'd do >>> it. If that's not appropriate, I won't. >> >> The choices aren't necessarily all or nothing. What about voluntary, >> recommended use of a perltidy config file included with the >> distribution, with additional 'caveats'? > > I'm happy with that idea. Why not come up with something and make it > available for us to try out? > > > Cheers, > Sendu. 
Will do. Maybe something that conforms to PBP; there's a PBP perltidy config on perlmonks, along with some emacs/vim related bits: http://www.perlmonks.org/?node_id=516501 chris From sdavis2 at mail.nih.gov Mon Jun 18 10:03:35 2007 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Mon, 18 Jun 2007 10:03:35 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <4676880C.9030009@sendu.me.uk> References: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu> <4676880C.9030009@sendu.me.uk> Message-ID: <467690B7.7090105@mail.nih.gov> Sendu Bala wrote: > Chris Fields wrote: >> If so it may be better to allow https, though I don't know how Chris >> D. and others feel about it. > > If it makes no difference to me as an end-user, I won't mind. But I > won't want to enter my password even once, at the beginning of a > session. If that's not possible with https, then ssh should be an option > as well. > > > Unrelated, but it randomly just occurred to me: what happens to all the > id lines at the top of modules? Eg: > > $Id: bl2seq.pm,v 1.28 2007/06/14 14:16:10 sendu Exp $ > > That's a cvs-specific thing, right? Do we delete them all? (Regardless, > I wish we would, since they caused me no end of hassles during the 1.5.2 > release, doing updates across branches.) See here: http://svnbook.red-bean.com/en/1.0/ch07s02.html Check out the section at the bottom having to do with svn:keywords. 
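
A minimal sketch for the $Id$ question, assuming the aim is to drop the CVS-expanded text but keep a bare $Id$ that Subversion can expand again once svn:keywords is set on the file. The collapse_id helper and the demo file are made up for illustration; the svn propset call is shown only as a comment, since it needs a working copy.

```shell
#!/bin/sh
# Collapse an expanded CVS keyword such as
#   $Id: bl2seq.pm,v 1.28 2007/06/14 14:16:10 sendu Exp $
# back to the bare form "$Id$". Prints the result to stdout.
collapse_id() {
    sed 's/\$Id:[^$]*\$/$Id$/' "$1"
}

# Demo on a throwaway file (hypothetical content).
demo=$(mktemp)
printf '%s\n' '# $Id: bl2seq.pm,v 1.28 2007/06/14 14:16:10 sendu Exp $' \
              'package Foo;' > "$demo"
collapse_id "$demo"
rm -f "$demo"

# Inside an svn working copy, keyword expansion is switched on per file:
#   svn propset svn:keywords "Id" path/to/Module.pm
```

Collapsing rather than deleting keeps the option open: files without the property simply carry an inert $Id$.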
Sean From akarger at CGR.Harvard.edu Mon Jun 18 10:10:57 2007 From: akarger at CGR.Harvard.edu (Amir Karger) Date: Mon, 18 Jun 2007 10:10:57 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <46751EC7.8020609@sheffield.ac.uk> References: <10517356.1182013758160.JavaMail.myubc2@brahms.my.ubc.ca> <46751EC7.8020609@sheffield.ac.uk> Message-ID: > Just to clarify, subversion is available as command line for windows: > http://subversion.tigris.org/project_packages.html > > TortoiseSVN is another svn client with a GUI that integrates into the > shell. I tried setting this up a while back to use ssh (via > PUTTY), but > I wasn't successful. This may have been due to me just > starting out with > svn or that it was harder to setup in an earlier version of > TortoiseSVN. > > Does anyone have experience of setting up svn on Windows to > use ssh? If > the changeover takes place, I'm happy to write some howto's > for setting > up svn clients for Windows. Here are some notes I wrote recently. I'm using this with command-line svn, not TortoiseSVN. I would hope that it would work with Tortoise, too, but I can't guarantee. 1. Run PuTTYgen (installed with PuTTY, probably in Start menu->Programs->PuTTY) and follow directions to create a private key file like C:\someplace\private_key.ppk and a public key. At this point, you'll pick an ssh password, which is separate from your login password. 2. Get an account with the appropriate .ssh/authorized_keys file on the host machine. (This is not Windows-specific. By the way, if you change the lines of the authorized_keys file to start with, e.g., command="svnserve -t -r /main/repos/dir",no-pty ssh-rsa AAAAB... comment then (a) you're more secure because users can't open a real shell on the computer, and (b) users don't need to type the repository directory in their svn co commands.) 3. Set your environment variables (My Computer->Properties. Advanced Tab, click on Environment Variables. 
In the top half ("User variables for ..."), click "New" and put in the variable name and value. 3a. Set the SVN_EDITOR environment variable to your favorite editor, such as vim or emacs, or a full path to some other editor. If it's not set, then either VISUAL or EDITOR must be set. 3b. Set the SVN_SSH environment variable to run PuTTY's "plink" program, which is the Windows equivalent of command-line ssh. If you installed PuTTY in the default location, set it to "C:/Program Files/PuTTY/plink.exe". Note 1: use FORWARD slashes. Note 2: Include the quotation marks in the environment variable. 4. When you want to start using svn, you'll need to run Pageant (Start menu->Programs->PuTTY), select "Add Key", browse to your private key file, and enter the ssh password you chose in step 1 (not your login password). Pageant will stay running until you quit it or logout, so you can have multiple svn checkins etc., and you only need to type in your password once. 5. Now just run command-line svn commands the same way you would on UNIX (modulo Windows' brain-dead shell). -Amir Karger From cjfields at uiuc.edu Mon Jun 18 10:24:00 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 18 Jun 2007 09:24:00 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <4676880C.9030009@sendu.me.uk> References: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu> <4676880C.9030009@sendu.me.uk> Message-ID: <278C510F-8873-4F3D-A399-955979DD3DA5@uiuc.edu> On Jun 18, 2007, at 8:26 AM, Sendu Bala wrote: > Chris Fields wrote: >> If so it may be better to allow https, though I don't know how >> Chris D. and others feel about it. > > If it makes no difference to me as an end-user, I won't mind. But I > won't want to enter my password even once, at the beginning of a > session. If that's not possible with https, then ssh should be an > option as well. 
Aaron pointed out in a related post that https access is the preferred option behind a corporate firewall (svn prompts for proxy user/pass, then caches it). Not sure how Jason/Hilmar/Chris D. feel about https or supporting both https+ssh. ... >> Did we make a decision as to the fate of cvs if we get svn up-and- >> running? Keep it around (assuming svn commits would be carried >> over to cvs and vice versa)? Or see what happens over time? > > Well, I don't think hard decisions are possible until we know how > its going to work in practice. I tried setting up my own svn > repository once, but didn't keep it and can't remember much about it. Agree; we'll need to work out specifics once we know how things work out using cvs2svn. I think the idea is to test using a smaller distribution (maybe network or db) and move up from there. > So, I suppose we'll play it by ear and decide things later. Is > someone out there actively doing something leading toward a > demonstration of how it will be? George Hartzell is going to test it out, I believe, and will post something when he can. chris From dmessina at wustl.edu Mon Jun 18 10:54:31 2007 From: dmessina at wustl.edu (David Messina) Date: Mon, 18 Jun 2007 09:54:31 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> <3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu> <467684FD.3080300@sendu.me.uk> Message-ID: <67E635BD-FC19-4046-949B-358B671299E6@wustl.edu> [Chris F] > Will do. 
Maybe something that conforms to PBP; there's a PBP > perltidy config on perlmonks, along with some emacs/vim related bits: > > http://www.perlmonks.org/?node_id=516501 FYI, perltidy now has a built-in -pbp flag: [from perltidy-20070508] > -pbp, --perl-best-practices > -pbp is an abbreviation for the parameters in the book Perl Best > Practices by Damian Conway: > > -l=78 -i=4 -ci=4 -st -se -vt=2 -cti=0 -pt=1 -bt=1 -sbt=1 -bbt=1 > -nsfs -nolq > -wbb="% + - * / x != == >= <= =~ !~ < > | & = > **= += *= &= <<= &&= -= /= |= >>= ||= //= .= %= ^= x=" > Note that the -st and -se flags make perltidy act as a filter on > one file only. These can be overridden with -nst and -nse if > necessary. > [full docs here: http://search.cpan.org/~shancock/Perl-Tidy-20070508/ bin/perltidy] Dave From dmessina at wustl.edu Mon Jun 18 11:04:10 2007 From: dmessina at wustl.edu (David Messina) Date: Mon, 18 Jun 2007 10:04:10 -0500 Subject: [Bioperl-l] Network tests overhaul In-Reply-To: <467661F0.2060703@sendu.me.uk> References: <467661F0.2060703@sendu.me.uk> Message-ID: Awesome, Sendu! Really glad you implemented this. > Can anyone offer a > way to systematically find at least the test scripts which access the > internet, if not the specific tests within? I think tests would be accessing the net indirectly through a BioPerl module (which may also be using indirect access), so it'd be hard to come up with a universal glob for that. However: % grep -Rl BIOPERLDEBUG bioperl-live/t | wc -l 108 % ls -1 bioperl-live/t | wc -l 248 Less than half of the test files use BIOPERLDEBUG, so that narrows down the possibilities... 
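
The grep above counts files that do mention BIOPERLDEBUG. A rough sketch of the complementary check, assuming a test that names a network-facing module but never mentions BIOPERLDEBUG is worth a look. The module list in NET_PATTERN is only an illustrative guess and will miss tests that reach the network indirectly.

```shell
#!/bin/sh
# Illustrative pattern only; extend it with the modules your tests really use.
NET_PATTERN='LWP::UserAgent|Bio::DB::GenBank|Bio::DB::EUtilities|Bio::Tools::Run::RemoteBlast'

# Print files under directory $1 that mention a network-ish module
# but never mention BIOPERLDEBUG.
unguarded_network_tests() {
    grep -RL 'BIOPERLDEBUG' "$1" | while IFS= read -r f; do
        if grep -Eq "$NET_PATTERN" "$f"; then
            printf '%s\n' "$f"
        fi
    done | sort
}

# Run against a checkout if one is present in the current directory.
if [ -d bioperl-live/t ]; then
    unguarded_network_tests bioperl-live/t
fi
```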
Dave From bix at sendu.me.uk Mon Jun 18 11:09:19 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 18 Jun 2007 16:09:19 +0100 Subject: [Bioperl-l] Network tests overhaul In-Reply-To: References: <467661F0.2060703@sendu.me.uk> Message-ID: <4676A01F.30205@sendu.me.uk> David Messina wrote: >> Can anyone offer a >> way to systematically find at least the test scripts which access the >> internet, if not the specific tests within? > > I think tests would be accessing the net indirectly through a BioPerl > module (which may also be using indirect access), so it'd be hard to > come up with a universal glob for that. > > However: > > % grep -Rl BIOPERLDEBUG bioperl-live/t | wc -l > 108 > > % ls -1 bioperl-live/t | wc -l > 248 > > Less than half of the test files use BIOPERLDEBUG, so that narrows down > the possibilities... Not necessarily. The problem is that there may be test scripts that have never even tried to skip network tests, and therefore don't use BIOPERLDEBUG. (Or that chose their own way to decide when to skip.) I was thinking along the lines of, does anyone know how to monitor accesses to the network card (or equivalent), getting information on which program (test script) requested the access? From cjfields at uiuc.edu Mon Jun 18 11:41:28 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 18 Jun 2007 10:41:28 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <67E635BD-FC19-4046-949B-358B671299E6@wustl.edu> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> <3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu> <467684FD.3080300@sendu.me.uk> <67E635BD-FC19-4046-949B-358B671299E6@wustl.edu> Message-ID: On Jun 18, 2007, at 9:54 AM, David Messina wrote: > [Chris F] >> Will do. 
Maybe something that conforms to PBP; there's a PBP >> perltidy config on perlmonks, along with some emacs/vim related bits: >> >> http://www.perlmonks.org/?node_id=516501 > > > FYI, perltidy now has a built-in -pbp flag: > > [from perltidy-20070508] >> -pbp, --perl-best-practices >> -pbp is an abbreviation for the parameters in the book Perl Best >> Practices by Damian Conway: >> >> -l=78 -i=4 -ci=4 -st -se -vt=2 -cti=0 -pt=1 -bt=1 -sbt=1 -bbt=1 >> -nsfs -nolq >> -wbb="% + - * / x != == >= <= =~ !~ < > | & = >> **= += *= &= <<= &&= -= /= |= >>= ||= //= .= %= ^= x=" >> Note that the -st and -se flags make perltidy act as a filter on >> one file only. These can be overridden with -nst and -nse if >> necessary. >> > [full docs here: http://search.cpan.org/~shancock/Perl-Tidy-20070508/ > bin/perltidy] > > > Dave Makes sense that would eventually be incorporated. If so there's no need to include a config (unless we want to sway away from PBP-style). We can just recommend everyone use that setting. chris From cjfields at uiuc.edu Mon Jun 18 12:06:26 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 18 Jun 2007 11:06:26 -0500 Subject: [Bioperl-l] Network tests overhaul In-Reply-To: <4676A01F.30205@sendu.me.uk> References: <467661F0.2060703@sendu.me.uk> <4676A01F.30205@sendu.me.uk> Message-ID: <082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu> On Jun 18, 2007, at 10:09 AM, Sendu Bala wrote: > David Messina wrote: >>> ... >> Less than half of the test files use BIOPERLDEBUG, so that narrows >> down >> the possibilities... > > Not necessarily. The problem is that there may be test scripts that > have > never even tried to skip network tests, and therefore don't use > BIOPERLDEBUG. (Or that chose their own way to decide when to skip.) > > I was thinking along the lines of, does anyone know how to monitor > accesses to the network card (or equivalent), getting information on > which program (test script) requested the access? 
EUtilities.t uses network tests predominately. I'll switch over when I commit everything from the overhaul. Couldn't you enable BIOPERLDEBUG, disable network access, then iterate through tests checking for those which fail or skip? I think Test::Harness has a way to do this, using execute_tests(). chris From bix at sendu.me.uk Mon Jun 18 12:34:38 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 18 Jun 2007 17:34:38 +0100 Subject: [Bioperl-l] Network tests overhaul In-Reply-To: <082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu> References: <467661F0.2060703@sendu.me.uk> <4676A01F.30205@sendu.me.uk> <082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu> Message-ID: <4676B41E.3050706@sendu.me.uk> Chris Fields wrote: > Couldn't you enable BIOPERLDEBUG, disable network access, then iterate > through tests checking for those which fail or skip? Yes, good idea, though my dev machine is also my email/webserver so I'd rather come up with an alternate solution than one involving 'disable network access'. Still, that's what I'll probably end up doing. Cheers! Oh, Chris, Spiros, how goes the Test::More conversion? I might want to wait for you to finish, or join in? If you're not going to have time to do any more in the next few weeks, can you please update http://www.bioperl.org/wiki/TestMoreProgress removing your name (or in the opposite case, add your name in)? Its not quite clear to me which tests are assigned to whom. Can someone clarify what the markings mean? Cheers, Sendu. 
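
One possible way to see which test scripts touch the network without unplugging the machine, sketched under the assumption of a Linux box with strace installed and tests run from the distribution root. The function names and log file names are made up; DNS lookups and localhost connections will also show up as hits.

```shell
#!/bin/sh
# Succeeds if the given strace log contains a connect() on an
# IPv4 or IPv6 socket (AF_UNIX connects are ignored).
made_inet_connection() {
    grep -Eq 'connect\(.*sa_family=AF_INET6?[,}]' "$1"
}

# Run each test under strace with network tests enabled and print
# the names of those that attempted an internet connection.
trace_tests() {
    for t in "$@"; do
        log="trace.$(basename "$t" .t).log"
        BIOPERLDEBUG=1 strace -f -e trace=connect -o "$log" \
            perl -I. -w "$t" >/dev/null 2>&1
        if made_inet_connection "$log"; then
            echo "$t"
        fi
    done
}

# usage: trace_tests t/*.t
```

Because the log records attempts rather than successes, this also catches tests whose remote server happens to be down.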
From cjfields at uiuc.edu Mon Jun 18 12:43:31 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 18 Jun 2007 11:43:31 -0500 Subject: [Bioperl-l] Network tests overhaul In-Reply-To: <4676B41E.3050706@sendu.me.uk> References: <467661F0.2060703@sendu.me.uk> <4676A01F.30205@sendu.me.uk> <082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu> <4676B41E.3050706@sendu.me.uk> Message-ID: <4D4061A2-E937-4F97-85C3-8937B597C64F@uiuc.edu> On Jun 18, 2007, at 11:34 AM, Sendu Bala wrote: > Chris Fields wrote: >> Couldn't you enable BIOPERLDEBUG, disable network access, then >> iterate through tests checking for those which fail or skip? > > Yes, good idea, though my dev machine is also my email/webserver so > I'd rather come up with an alternate solution than one involving > 'disable network access'. > > Still, that's what I'll probably end up doing. Cheers! > > > Oh, Chris, Spiros, how goes the Test::More conversion? I might want > to wait for you to finish, or join in? If you're not going to have > time to do any more in the next few weeks, can you please update > http://www.bioperl.org/wiki/TestMoreProgress removing your name (or > in the opposite case, add your name in)? Its not quite clear to me > which tests are assigned to whom. Can someone clarify what the > markings mean? > > Cheers, > Sendu. Not sure how far along spiros is; I handed it over after I finished up to the 'Q' tests. In general the ones marked out have been converted over, ones with names next to them have been claimed. If you need help I'll prob. start back up again to finish them off; we just need to divy them up. chris From george.heller at yahoo.com Mon Jun 18 13:07:59 2007 From: george.heller at yahoo.com (George Heller) Date: Mon, 18 Jun 2007 10:07:59 -0700 (PDT) Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <12C87737-B6D5-46E5-BCF4-5388A949009B@gmx.net> Message-ID: <218165.62089.qm@web56505.mail.re3.yahoo.com> What exactly is the "node n" in the query below. 
When I issue this query, it says,

  relation "node" does not exist.

I tried to use the get_all_Descendents method, but it looks like in order to recurse it calls the method each_Descendent. This method is not implemented in Bio::DB::Taxonomy. It just has a single line,

  shift->throw_not_implemented();

Thanks.
George.

Hilmar Lapp wrote:
I'm a bit confused - it sounds like you have set up a local BioSQL database and loaded the NCBI taxonomy into the database. You can now use simple SQL to retrieve all descendants of a node in the tree given its NCBI taxonID, such as

  SELECT tn.*, tnm.name
  FROM taxon tn, taxon_name tnm, node n
  WHERE n.ncbi_taxon_id = :taxonID
  AND tn.left_value > n.left_value
  AND tn.right_value < n.right_value
  AND tn.taxon_id = tnm.taxon_id
  AND tn.name_class = 'scientific_name'

BioPerl doesn't have a Taxonomy::biosql module yet (though this would seem like a worthwhile thing to add), so you can't use the Bio::DB::Taxonomy interface to do this against a BioSQL instance.

However, BioPerl does support the flat-file download of the NCBI taxonomy database and indexes it, so you can simply use Taxonomy::{get_taxon,get_all_Descendents} with the flatfile download to achieve what you wanted in fewer than 5 lines of perl.

Although the recursive implementation of Taxonomy::get_all_Descendents() won't be lightning fast, it may still be perfectly fine for your application - are you sure it is not?

-hilmar

On Jun 18, 2007, at 12:21 AM, George Heller wrote:

> Thanks. And how can I assign the $node here in the below code, such
> that I can reference it to a particular taxon id record? I want to
> retrieve all the descendents from the taxonomy hierarchy, given a
> particular taxon id.
>
> I have a local db setup, in which I have uploaded data using the
> load_ncbi_taxonomy.pl script.
>
> Thanks.
> George
>
> Jason Stajich wrote:
> I assume you already figured out how to setup a local taxonomydb?
> > > You just want the extant species/leaves of the tree > > > my @extant_children = grep { $_->is_Leaf } $node->get_all_Descedents; > > > > -jason > On Jun 17, 2007, at 11:41 AM, George Heller wrote: > > Hi all, > > > Can anyone point me to some example that uses the > get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at > this, and I am not quite sure how to implement it. > > > Thanks. > George > > > Sendu Bala wrote: > George Heller wrote: > Hi all, > > > I am looking at extracting the taxonomy hierarchy for some taxon > ids. > What I plan to do is, for a given taxon id, say 33090, I want to > extract all taxon ids that are children of this species. I do not > just want the immediate children, but the children's children and so > on. > > > Any ideas on the way I can go about doing this? > > > Well, you'll use Bio::DB::Taxonomy presumably, and > each_Descendent in > some kind of looping structure. Most easily a recursing sub. > > > If you happen to code up something neat and efficient, why not > share it > with us and we could add it to the Taxonomy module(s). > > > > > > > --------------------------------- > Shape Yahoo! in your own image. Join our Network Research Panel > today! > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > -- > Jason Stajich > jason at bioperl.org > http://jason.open-bio.org/ > > > > > > > > --------------------------------- > Need a vacation? Get great deals to amazing places on Yahoo! Travel. 
-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From jason at bioperl.org Mon Jun 18 13:53:28 2007
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 18 Jun 2007 10:53:28 -0700
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <218165.62089.qm@web56505.mail.re3.yahoo.com>
References: <218165.62089.qm@web56505.mail.re3.yahoo.com>
Message-ID: <6F7C6223-2DD7-46B4-A637-EC6AFFC6BDBC@bioperl.org>

It is implemented in the implementing class - Bio::DB::Taxonomy is just the base class. For example, see the flatfile implementation, Bio::DB::Taxonomy::flatfile.

See scripts/taxa/local_taxonomydb_query.PLS for an example using it; the nodes and names files are from the NCBI taxonomy database.

Here is an un-debugged copy+paste for your question that *should* work.

  use Bio::DB::Taxonomy;

  my $idx_dir = '/tmp';    # directory where the flatfile index is written
  my ($nodesfile, $namesfile) = ('nodes.dmp', 'names.dmp');
  my $db = Bio::DB::Taxonomy->new(-source    => 'flatfile',
                                  -nodesfile => $nodesfile,
                                  -namesfile => $namesfile,
                                  -directory => $idx_dir);
  my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
  # keep only the leaves (extant taxa) below the node
  my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

-jason

On Jun 18, 2007, at 10:07 AM, George Heller wrote:

> What exactly is the "node n" in the query below. When I issue this
> query, it says,
>
> relation "node" does not exist.
>
> I tried to use the get_all_Descendents method but it looks like
> in order to do a recursive call it calls the method
> each_Descendent. This method is not implemented in
> Bio::DB::Taxonomy. It just has a single line,
>
> shift->throw_not_implemented();
>
> Thanks.
> George.
>> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > > > > --------------------------------- > Take the Internet to Go: Yahoo!Go puts the Internet in your pocket: > mail, news, photos & more. -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From hlapp at gmx.net Mon Jun 18 18:10:00 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 18 Jun 2007 18:10:00 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <278C510F-8873-4F3D-A399-955979DD3DA5@uiuc.edu> References: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu> <4676880C.9030009@sendu.me.uk> <278C510F-8873-4F3D-A399-955979DD3DA5@uiuc.edu> Message-ID: <989DBD68-896E-4FB9-9413-4A1060E88ABD@gmx.net> https is working fine for me for sf.net repositories, and I only have to enter the password upon first commit (since checkout doesn't even need a password). -hilmar On Jun 18, 2007, at 10:24 AM, Chris Fields wrote: > Not sure how Jason/Hilmar/Chris D. 
feel about https or supporting
> both https+ssh

-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From george.heller at yahoo.com Mon Jun 18 18:18:21 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 15:18:21 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <6F7C6223-2DD7-46B4-A637-EC6AFFC6BDBC@bioperl.org>
Message-ID: <904670.24974.qm@web56513.mail.re3.yahoo.com>

I tried running the script mentioned below and I seem to be getting the following error:

  Weak references are not implemented in the version of perl at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
  BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76.
  Compilation failed in require at my.pl line 7.
  BEGIN failed--compilation aborted at my.pl line 7.

My script looks something like,

  #!/usr/bin/perl
  use strict;
  #use warnings;
  use DBI;
  use Bio::Tree::Node;
  use Bio::DB::Taxonomy;
  use Bio::DB::Taxonomy::flatfile;

  my $idx_dir = '/tmp';
  my ($nodesfile, $namesfile) = ('nodes.dmp', 'names.dmp');
  my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
                                 -nodesfile => $nodesfile,
                                 -namesfile => $namesfile,
                                 -directory => $idx_dir);
  my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
  my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
  foreach my $field (@extant_children) {
      print "$field";
      print "|";
      print "\n";
  }

And I am running the script using the command,

  perl myscript.pl -v --names names.dmp --nodes nodes.dmp

and I have the nodes.dmp and names.dmp files in the current directory.

Thanks,
George

Jason Stajich wrote:
It is implemented in the implementing class - DB::Taxonomy is just the base class.
_______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From hlapp at gmx.net Mon Jun 18 18:27:19 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 18 Jun 2007 18:27:19 -0400 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <218165.62089.qm@web56505.mail.re3.yahoo.com> References: <218165.62089.qm@web56505.mail.re3.yahoo.com> Message-ID: On Jun 18, 2007, at 1:07 PM, George Heller wrote: > What exactly is the "node n" in the query below. When I issue this > query, it says, Sorry, replace "node" with "taxon". Jason answered the rest.
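For readers who want to see the descendant walk from this thread without a working BioPerl install, here is a minimal pure-Perl sketch. It is not the Bio::DB::Taxonomy implementation. It assumes only the nodes.dmp layout from the NCBI taxonomy dump, where each line starts with tax_id and parent tax_id separated by tab-pipe-tab. The helper names (build_children_map, all_descendants, leaves_only) and the toy taxon IDs are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build a parent -> [children] map from nodes.dmp-style lines.
# Assumed layout: tax_id <TAB>|<TAB> parent_tax_id <TAB>|<TAB> rank ...
sub build_children_map {
    my (@lines) = @_;
    my %children;
    for my $line (@lines) {
        my ($id, $parent) = split /\t\|\t/, $line;
        next unless defined $parent;
        next if $id == $parent;    # the root lists itself as its own parent
        push @{ $children{$parent} }, $id;
    }
    return \%children;
}

# Collect every descendant of $taxid using an explicit stack,
# so very deep trees do not depend on recursion depth.
sub all_descendants {
    my ($children, $taxid) = @_;
    my @found;
    my @stack = @{ $children->{$taxid} || [] };
    while (@stack) {
        my $id = pop @stack;
        push @found, $id;
        push @stack, @{ $children->{$id} || [] };
    }
    return @found;
}

# Keep only the IDs that have no children of their own (the leaves).
sub leaves_only {
    my ($children, @ids) = @_;
    return grep { !exists $children->{$_} } @ids;
}

# Toy data: invented IDs, only the field layout mirrors nodes.dmp.
my @toy = (
    "1\t|\t1\t|\tno rank\t|",
    "10\t|\t1\t|\tkingdom\t|",
    "11\t|\t10\t|\tgenus\t|",
    "12\t|\t11\t|\tspecies\t|",
    "13\t|\t11\t|\tspecies\t|",
    "20\t|\t1\t|\tkingdom\t|",
);

my $children = build_children_map(@toy);
my @desc     = sort { $a <=> $b } all_descendants($children, 10);
my @leaves   = sort { $a <=> $b } leaves_only($children, @desc);

print "descendants of 10: @desc\n";
print "leaves under 10: @leaves\n";
```

With a real dump, the same subs could be fed the lines of nodes.dmp read from a file handle. Holding the whole parent-to-children map in memory is a design choice of this sketch, not something the thread prescribes.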
-- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Mon Jun 18 18:33:40 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 18 Jun 2007 17:33:40 -0500 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <904670.24974.qm@web56513.mail.re3.yahoo.com> References: <904670.24974.qm@web56513.mail.re3.yahoo.com> Message-ID: <707D8821-68CB-40AE-A190-AD0F8F2C3915@uiuc.edu> As the error implies, your local version of perl doesn't seem to support weak references, which means it doesn't have Scalar::Util (which was added to core after perl 5.6.1, I think). Try installing Scalar::Util to see what happens. chris On Jun 18, 2007, at 5:18 PM, George Heller wrote: > I tried running the below mentioned script and I seem to be getting > the following error: > > Weak references are not implemented in the version of perl at / > usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76 > BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/ > Bio/Tree/Node.pm line 76. > Compilation failed in require at my.pl line 7. > BEGIN failed--compilation aborted at my.pl line 7.
> > My script looks something like, > > #!/usr/bin/perl > use strict; > #use warnings; > use DBI; > use Bio::Tree::Node; > use Bio::DB::Taxonomy; > use Bio::DB::Taxonomy::flatfile; > my $idx_dir = '/tmp'; > > my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp'); > my $db = new Bio::DB::Taxonomy(-source => 'flatfile', > -nodesfile => $nodesfile, > -namesfile => $namesfile, > -directory => $idx_dir); > my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); > my @extant_children = grep { $_->is_Leaf } $node- > >get_all_Descendents; > > foreach $field (@extant_children) { > print "$field"; > print "|"; > print "\n"; > } > > And I am running the script using the command, > > perl myscript.pl -v --names names.dmp --nodes nodes.dmp > > and I have the nodes.dmp and names.dmp files in the current > directory. > > Thanks, > George Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Mon Jun 18 18:50:38 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 18 Jun 2007 18:50:38 -0400 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <707D8821-68CB-40AE-A190-AD0F8F2C3915@uiuc.edu> References: <904670.24974.qm@web56513.mail.re3.yahoo.com> <707D8821-68CB-40AE-A190-AD0F8F2C3915@uiuc.edu> Message-ID: The perl version appears to be 5.8.5 though, so something strange appears to be going on too. George, can you please post the output of $ /usr/bin/perl -V -hilmar On Jun 18, 2007, at 6:33 PM, Chris Fields wrote: > As the error implies your local version of perl doesn't seem support > weak references, which means it doesn't have Scalar::Utils (which was > added to core after perl 5.6.1, I think). Try installing > Scalar::Utils to see what happens.
> > chris -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From george.heller at yahoo.com Mon Jun 18 19:05:42 2007 From: george.heller at yahoo.com (George Heller) Date: Mon, 18 Jun 2007 16:05:42 -0700 (PDT) Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: Message-ID: <706979.34648.qm@web56509.mail.re3.yahoo.com> This is the output of /usr/bin/perl -V Summary of my perl5 (revision 5 version 8 subversion 5) configuration: Platform: osname=linux, osvers=2.6.9-22.18.bz155725.elsmp, archname=i386-linux-thread-multi uname='linux hs20-bc1-4.build.redhat.com 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686 i686 i386 gnulinux ' config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost -Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc.
-Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0' hint=recommended, useposix=true, d_sigaction=define usethreads=define use5005threads=undef useithreads=define usemultiplicity=define useperlio=define d_sfio=undef uselargefiles=define usesocks=undef use64bitint=undef use64bitall=undef uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm', optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4', cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm' ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-2)', gccosandvers='' intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12 ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8 alignbytes=4, prototype=define Linker and Libraries: ld='gcc', ldflags =' -L/usr/local/lib' libpth=/usr/local/lib /lib /usr/lib libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc libc=/lib/libc-2.3.4.so, so=so, useshrplib=true, libperl=libperl.so gnulibc_version='2.3.4' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE' cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib' Characteristics of this binary (from libperl): Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS USE_LARGE_FILES 
PERL_IMPLICIT_CONTEXT Built under linux Compiled at Jul 24 2006 18:28:10 @INC: /usr/lib/perl5/5.8.5/i386-linux-thread-multi /usr/lib/perl5/5.8.5 /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.5 /usr/lib/perl5/site_perl/5.8.4 /usr/lib/perl5/site_perl/5.8.3 /usr/lib/perl5/site_perl/5.8.2 /usr/lib/perl5/site_perl/5.8.1 /usr/lib/perl5/site_perl/5.8.0 /usr/lib/perl5/site_perl /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.5 /usr/lib/perl5/vendor_perl/5.8.4 /usr/lib/perl5/vendor_perl/5.8.3 /usr/lib/perl5/vendor_perl/5.8.2 /usr/lib/perl5/vendor_perl/5.8.1 /usr/lib/perl5/vendor_perl/5.8.0 /usr/lib/perl5/vendor_perl Thanks. George . Hilmar Lapp wrote: The perl version appears to be 5.8.5 though, so something strange appears to be going on too. George, can you please post the output of $ /usr/bin/perl -V -hilmar On Jun 18, 2007, at 6:33 PM, Chris Fields wrote: > As the error implies your local version of perl doesn't seem support > weak references, which means it doesn't have Scalar::Utils (which was > added to core after perl 5.6.1, I think). Try installing > Scalar::Utils to see what happens. 
> > chris -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From jason at bioperl.org Mon Jun 18 19:22:08 2007 From: jason at bioperl.org (Jason Stajich) Date: Mon, 18 Jun 2007 16:22:08 -0700 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <706979.34648.qm@web56509.mail.re3.yahoo.com> References: <706979.34648.qm@web56509.mail.re3.yahoo.com> Message-ID: Try installing the latest Scalar::Util On Jun 18, 2007, at 4:05 PM, George Heller wrote: > This is the output of /usr/bin/perl -V > > Summary of my perl5 (revision 5 version 8 subversion 5) configuration: > Platform: > osname=linux, osvers=2.6.9-22.18.bz155725.elsmp, archname=i386- > linux-thread-multi > uname='linux hs20-bc1-4.build.redhat.com > 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686 > i686 i386 gnulinux ' > config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 - > mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost - > Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc.
- > Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux - > Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads - > Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db - > Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio - > Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/ > less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0' > hint=recommended, useposix=true, d_sigaction=define > usethreads=define use5005threads=undef useithreads=define > usemultiplicity=define > useperlio=define d_sfio=undef uselargefiles=define usesocks=undef > use64bitint=undef use64bitall=undef uselongdouble=undef > usemymalloc=n, bincompat5005=undef > Compiler: > cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno- > strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE - > D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm', > optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4', > cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict- > aliasing -pipe -I/usr/local/include -I/usr/include/gdbm' > ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-2)', > gccosandvers='' > intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234 > d_longlong=define, longlongsize=8, d_longdbl=define, > longdblsize=12 > ivtype='long', ivsize=4, nvtype='double', nvsize=8, > Off_t='off_t', lseeksize=8 > alignbytes=4, prototype=define > Linker and Libraries: > ld='gcc', ldflags =' -L/usr/local/lib' > libpth=/usr/local/lib /lib /usr/lib > libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil - > lpthread -lc > perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc > libc=/lib/libc-2.3.4.so, so=so, useshrplib=true, > libperl=libperl.so > gnulibc_version='2.3.4' > Dynamic Linking: > dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,- > E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE' > cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib' > > Characteristics of this binary (from 
libperl): > Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS > USE_LARGE_FILES PERL_IMPLICIT_CONTEXT > Built under linux > Compiled at Jul 24 2006 18:28:10 > @INC: > /usr/lib/perl5/5.8.5/i386-linux-thread-multi > /usr/lib/perl5/5.8.5 > /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi > /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi > /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi > /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi > /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi > /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi > /usr/lib/perl5/site_perl/5.8.5 > /usr/lib/perl5/site_perl/5.8.4 > /usr/lib/perl5/site_perl/5.8.3 > /usr/lib/perl5/site_perl/5.8.2 > /usr/lib/perl5/site_perl/5.8.1 > /usr/lib/perl5/site_perl/5.8.0 > /usr/lib/perl5/site_perl > /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi > /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi > /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi > /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi > /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi > /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi > /usr/lib/perl5/vendor_perl/5.8.5 > /usr/lib/perl5/vendor_perl/5.8.4 > /usr/lib/perl5/vendor_perl/5.8.3 > /usr/lib/perl5/vendor_perl/5.8.2 > /usr/lib/perl5/vendor_perl/5.8.1 > /usr/lib/perl5/vendor_perl/5.8.0 > /usr/lib/perl5/vendor_perl > > Thanks. > George > . > > Hilmar Lapp wrote: > The perl version appears to be 5.8.5 though, so something strange > appears to be going on too. > > George, can you please post the output of > > $ /usr/bin/perl -V > > -hilmar > > On Jun 18, 2007, at 6:33 PM, Chris Fields wrote: > >> As the error implies your local version of perl doesn't seem support >> weak references, which means it doesn't have Scalar::Utils (which was >> added to core after perl 5.6.1, I think). Try installing >> Scalar::Utils to see what happens. 
>> >> chris >> >> On Jun 18, 2007, at 5:18 PM, George Heller wrote: >> >>> I tried running the below mentioned script and I seem to be getting >>> the following error: >>> >>> Weak references are not implemented in the version of perl at / >>> usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76 >>> BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/ >>> Bio/Tree/Node.pm line 76. >>> Compilation failed in require at my.pl line 7. >>> BEGIN failed--compilation aborted at my.pl line 7. >>> >>> My script looks something like, >>> >>> #!/usr/bin/perl >>> use strict; >>> #use warnings; >>> use DBI; >>> use Bio::Tree::Node; >>> use Bio::DB::Taxonomy; >>> use Bio::DB::Taxonomy::flatfile; >>> my $idx_dir = '/tmp'; >>> >>> my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp'); >>> my $db = new Bio::DB::Taxonomy(-source => 'flatfile', >>> -nodesfile => $nodesfile, >>> -namesfile => $namesfile, >>> -directory => $idx_dir); >>> my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); >>> my @extant_children = grep { $_->is_Leaf } $node- >>>> get_all_Descendents; >>> >>> foreach $field (@extant_children) { >>> print "$field"; >>> print "|"; >>> print "\n"; >>> } >>> >>> And I am running the script using the command, >>> >>> perl myscript.pl -v --names names.dmp --nodes nodes.dmp >>> >>> and I have the nodes.dmp and names.dmp files in the current >>> directory. >>> >>> Thanks, >>> George >>> >>> >>> Jason Stajich wrote: >>> It is implemented in the implementing class - DB::Taxonomy is >>> just the base class. For example see the flatfile implementation >>> Bio::DB::Taxonomy::flatfile >>> >>> See the scripts/taxa/local_taxonomydb_query.PLS for example using >>> it: >>> nodes and names are from NCBI taxonomy database. >>> >>> >>> Here is an un-debugged copy+paste for your question that *should* >>> work. 
>>> >>> >>> use Bio::DB::Taxonomy >>> my $idx_dir = '/tmp'; >>> >>> >>> my ($nodefile,$namesfile) = ('nodes.dmp,'names.dmp'); >>> my $db = new Bio::DB::Taxonomy(-source => 'flatfile', >>> -nodesfile => $nodesfile, >>> -namesfile => $namesfile, >>> -directory => $idx_dir); >>> my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); >>> my @extant_children = grep { $_->is_Leaf } $node- >>>> get_all_Descendents; >>> >>> >>> >>> >>> -jason >>> >>> On Jun 18, 2007, at 10:07 AM, George Heller wrote: >>> >>> What exactly is the "node n" in the query below. When I issue >>> this query, it says, >>> >>> >>> relation "node" does not exist. >>> >>> >>> I tried to use the get_all_Descendents method but it looks like >>> in order to do a recursive call it calls the method >>> each_Descendent. This method is not implemented in >>> Bio::DB::Taxonomy. It just has a single line, >>> >>> >>> shift->throw_not_implemented(); >>> >>> >>> Thanks. >>> George. >>> >>> >>> Hilmar Lapp wrote: >>> I'm a bit confused - it sounds like you have set up a local >>> BioSQL >>> database and loaded the NCBI taxonomy into the database. You can >>> now >>> use simple SQL to retrieve all descendants of a node in the tree >>> given its NCBI taxonID such as >>> >>> >>> SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n >>> WHERE >>> n.ncbi_taxon_id = :taxonID >>> AND tn.left_value > n. left_value >>> AND tn.right_value < n.right_value >>> AND tn.taxon_id = tnm.taxon_id >>> AND tn.name_class = 'scientific_name' >>> >>> >>> BioPerl doesn't have a Taxonomy::biosql module yet (though this >>> would >>> seem like a worthwhile thing to add), so you can't use the >>> Bio::DB::Taxonomy interface to do this against a BioSQL instance. 
>>> >>> >>> However, BioPerl does have support for the flat-file download of >>> the >>> NCBI taxonomy database and indexes it, so you can simply use >>> Taxonomy::{get_taxon,get_all_Descendants} using the flatfile >>> download >>> to achieve what you wanted to do in a less than 5 lines of perl. >>> >>> >>> Although the recursive implementation of >>> Taxonomy::get_all_Descendants >>> () won't be lightning fast, it may still be perfectly fine for your >>> application - are you sure it is not? >>> >>> >>> -hilmar >>> >>> >>> On Jun 18, 2007, at 12:21 AM, George Heller wrote: >>> >>> >>> Thanks. And how can I assign the $node here in the below code, >>> such >>> that I can reference it to a particular taxon id record? I want to >>> retrieve all the descendents from the taxonomy hierarchy, given a >>> particular taxon id. >>> >>> >>> I have a local db setup, in which I have uploaded data using the >>> load_ncbi_taxonomy.pl script. >>> >>> >>> Thanks. >>> George >>> >>> >>> Jason Stajich wrote: >>> I assume you already figured out how to setup a local taxonomydb? >>> >>> >>> >>> >>> You just want the extant species/leaves of the tree >>> >>> >>> >>> >>> my @extant_children = grep { $_->is_Leaf } $node- >>>> get_all_Descedents; >>> >>> >>> >>> >>> >>> >>> -jason >>> On Jun 17, 2007, at 11:41 AM, George Heller wrote: >>> >>> >>> Hi all, >>> >>> >>> >>> >>> Can anyone point me to some example that uses the >>> get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at >>> this, and I am not quite sure how to implement it. >>> >>> >>> >>> >>> Thanks. >>> George >>> >>> >>> >>> >>> Sendu Bala wrote: >>> George Heller wrote: >>> Hi all, >>> >>> >>> >>> >>> I am looking at extracting the taxonomy hierarchy for some taxon >>> ids. >>> What I plan to do is, for a given taxon id, say 33090, I want to >>> extract all taxon ids that are children of this species. I do not >>> just want the immediate children, but the children's children >>> and so >>> on. 
>>> >>> >>> >>> >>> Any ideas on the way I can go about doing this? >>> >>> >>> >>> >>> Well, you'll use Bio::DB::Taxonomy presumably, and >>> each_Descendent in >>> some kind of looping structure. Most easily a recursing sub. >>> >>> >>> >>> >>> If you happen to code up something neat and efficient, why not >>> share it >>> with us and we could add it to the Taxonomy module(s). >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> --------------------------------- >>> Shape Yahoo! in your own image. Join our Network Research Panel >>> today! >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> >>> >>> >>> -- >>> Jason Stajich >>> jason at bioperl.org >>> http://jason.open-bio.org/ >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> --------------------------------- >>> Need a vacation? Get great deals to amazing places on Yahoo! >>> Travel. >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> >>> -- >>> =========================================================== >>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >>> =========================================================== >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> --------------------------------- >>> Take the Internet to Go: Yahoo!Go puts the Internet in your >>> pocket: mail, news, photos & more. >>> >>> >>> -- >>> Jason Stajich >>> jason at bioperl.org >>> http://jason.open-bio.org/ >>> >>> >>> >>> >>> >>> >>> >>> --------------------------------- >>> Bored stiff? Loosen up... >>> Download and play hundreds of games for free on Yahoo! Games. 
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>
> --
> ===========================================================
> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
> ===========================================================

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/

From george.heller at yahoo.com Mon Jun 18 20:04:00 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 17:04:00 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To:
Message-ID: <424035.72876.qm@web56507.mail.re3.yahoo.com>

Ok, I installed the latest of Scalar::Util and the script seems to be working. But I am confused where exactly I need to look for the descendent taxon ids once the script is run. I did look into the /tmp/ directory, but I couldn't understand much.

Sorry to be bothering, really appreciate your patience.

Thanks.
George

Jason Stajich wrote:
Try installing the latest Scalar::Util
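A minimal sketch of the "recursing sub" idea mentioned earlier in this thread, written in plain Perl without BioPerl. It assumes a toy table laid out like nodes.dmp (tax_id | parent tax_id | rank); the ids and ranks are invented, and the helper names all_descendants and leaf_descendants are hypothetical. BioPerl's own get_all_Descendents may well be implemented differently. What it illustrates is that the descendants come back as an ordinary list inside the script, and it is up to the script to print or store them.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toy data in the nodes.dmp layout (tax_id | parent tax_id | rank | ...).
# The ids and ranks below are invented for illustration only.
my @lines = (
    "1\t|\t1\t|\tno rank\t|",
    "10\t|\t1\t|\tgenus\t|",
    "11\t|\t1\t|\tgenus\t|",
    "100\t|\t10\t|\tspecies\t|",
    "101\t|\t10\t|\tspecies\t|",
    "110\t|\t11\t|\tspecies\t|",
);

# Build two lookups: parent id -> [child ids] and id -> rank.
my (%children, %rank);
for my $line (@lines) {
    my ($id, $parent, $rank) = split /\t\|\t?/, $line;
    $rank{$id} = $rank;
    next if $id eq $parent;    # the root lists itself as its own parent
    push @{ $children{$parent} }, $id;
}

# Return every descendant of $id (children, grandchildren, ...), depth first.
sub all_descendants {
    my ($id) = @_;
    my @found;
    for my $child (@{ $children{$id} || [] }) {
        push @found, $child, all_descendants($child);
    }
    return @found;
}

# Leaves are the descendants that have no children of their own.
sub leaf_descendants {
    my ($id) = @_;
    return grep { !exists $children{$_} } all_descendants($id);
}

# Nothing is written to disk; the result is a list we choose to print.
for my $leaf (leaf_descendants(1)) {
    print "id is $leaf, rank is $rank{$leaf}\n";
}
```

With the toy data above, the loop prints one line for each of the three leaf ids (100, 101 and 110). If a script built this way prints nothing, the list returned for the starting id is empty, which is the first thing worth checking.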
From jason at bioperl.org Mon Jun 18 20:17:34 2007
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 18 Jun 2007 17:17:34 -0700
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <424035.72876.qm@web56507.mail.re3.yahoo.com>
References: <424035.72876.qm@web56507.mail.re3.yahoo.com>
Message-ID: <1FE460CB-F001-4EAE-9E83-FDF52AFFA5D0@bioperl.org>

All the children are in this array. You get to decide what you want to do with them. In the following example I print the id, rank, and scientific name out to the screen. Because this is a taxonomy db query you are getting back Bio::Taxonomy::Taxon objects, so read the documentation for this module to see what you can do with the object. I would also suggest spending a little time with the Getting started and HOWTO:Trees documentation on the website to get familiar with the objects and nomenclature.

my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
for my $child ( @extant_children ) {
    print "id is ", $child->id, "\n";      # NCBI taxa id
    print "rank is ", $child->rank, "\n";  # e.g. species
    print "scientific name is ", $child->scientific_name, "\n"; # scientific name
}

On Jun 18, 2007, at 5:04 PM, George Heller wrote:

> Ok, I installed the latest of Scalar::Util and the script seems to
> be working. But I am confused where exactly I need to look for the
> descendent taxon ids once the script is run. I did look into the
> /tmp/ directory, but I couldn't understand much.
>
> Sorry to be bothering, really appreciate your patience.
>
> Thanks.
> George > > Jason Stajich wrote: > Try installing the latest Scalar::Util > On Jun 18, 2007, at 4:05 PM, George Heller wrote: > > This is the output of /usr/bin/perl -V > > > Summary of my perl5 (revision 5 version 8 subversion 5) > configuration: > Platform: > osname=linux, osvers=2.6.9-22.18.bz155725.elsmp, > archname=i386-linux-thread-multi > uname='linux hs20-bc1-4.build.redhat.com > 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686 > i686 i386 gnulinux ' > config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 - > mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost - > Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. - > Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux - > Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads - > Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db - > Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio - > Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/ > less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0' > hint=recommended, useposix=true, d_sigaction=define > usethreads=define use5005threads=undef useithreads=define > usemultiplicity=define > useperlio=define d_sfio=undef uselargefiles=define > usesocks=undef > use64bitint=undef use64bitall=undef uselongdouble=undef > usemymalloc=n, bincompat5005=undef > Compiler: > cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING - > fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE - > D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm', > optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4', > cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict- > aliasing -pipe -I/usr/local/include -I/usr/include/gdbm' > ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-2)', > gccosandvers='' > intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234 > d_longlong=define, longlongsize=8, d_longdbl=define, > longdblsize=12 > ivtype='long', ivsize=4, 
nvtype='double', nvsize=8, > Off_t='off_t', lseeksize=8 > alignbytes=4, prototype=define > Linker and Libraries: > ld='gcc', ldflags =' -L/usr/local/lib' > libpth=/usr/local/lib /lib /usr/lib > libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil - > lpthread -lc > perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc > libc=/lib/libc-2.3.4.so, so=so, useshrplib=true, > libperl=libperl.so > gnulibc_version='2.3.4' > Dynamic Linking: > dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='- > Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE' > cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib' > > > Characteristics of this binary (from libperl): > Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS > USE_LARGE_FILES PERL_IMPLICIT_CONTEXT > Built under linux > Compiled at Jul 24 2006 18:28:10 > @INC: > /usr/lib/perl5/5.8.5/i386-linux-thread-multi > /usr/lib/perl5/5.8.5 > /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi > /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi > /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi > /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi > /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi > /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi > /usr/lib/perl5/site_perl/5.8.5 > /usr/lib/perl5/site_perl/5.8.4 > /usr/lib/perl5/site_perl/5.8.3 > /usr/lib/perl5/site_perl/5.8.2 > /usr/lib/perl5/site_perl/5.8.1 > /usr/lib/perl5/site_perl/5.8.0 > /usr/lib/perl5/site_perl > /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi > /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi > /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi > /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi > /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi > /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi > /usr/lib/perl5/vendor_perl/5.8.5 > /usr/lib/perl5/vendor_perl/5.8.4 > /usr/lib/perl5/vendor_perl/5.8.3 > /usr/lib/perl5/vendor_perl/5.8.2 > 
/usr/lib/perl5/vendor_perl/5.8.1 > /usr/lib/perl5/vendor_perl/5.8.0 > /usr/lib/perl5/vendor_perl > > > Thanks. > George > . > > > Hilmar Lapp wrote: > The perl version appears to be 5.8.5 though, so something strange > appears to be going on too. > > > George, can you please post the output of > > > $ /usr/bin/perl -V > > > -hilmar > > > On Jun 18, 2007, at 6:33 PM, Chris Fields wrote: > > > As the error implies your local version of perl doesn't seem > support > weak references, which means it doesn't have Scalar::Utils (which > was > added to core after perl 5.6.1, I think). Try installing > Scalar::Utils to see what happens. > > > chris > > > On Jun 18, 2007, at 5:18 PM, George Heller wrote: > > > I tried running the below mentioned script and I seem to be > getting > the following error: > > > Weak references are not implemented in the version of perl at / > usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76 > BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/ > Bio/Tree/Node.pm line 76. > Compilation failed in require at my.pl line 7. > BEGIN failed--compilation aborted at my.pl line 7. > > > My script looks something like, > > > #!/usr/bin/perl > use strict; > #use warnings; > use DBI; > use Bio::Tree::Node; > use Bio::DB::Taxonomy; > use Bio::DB::Taxonomy::flatfile; > my $idx_dir = '/tmp'; > > > my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp'); > my $db = new Bio::DB::Taxonomy(-source => 'flatfile', > -nodesfile => $nodesfile, > -namesfile => $namesfile, > -directory => $idx_dir); > my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); > my @extant_children = grep { $_->is_Leaf } $node- > get_all_Descendents; > > > foreach $field (@extant_children) { > print "$field"; > print "|"; > print "\n"; > } > > > And I am running the script using the command, > > > perl myscript.pl -v --names names.dmp --nodes nodes.dmp > > > and I have the nodes.dmp and names.dmp files in the current > directory. 
-- Jason Stajich jason at bioperl.org http://jason.open-bio.org/

From george.heller at yahoo.com Mon Jun 18 20:29:31 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 17:29:31 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <1FE460CB-F001-4EAE-9E83-FDF52AFFA5D0@bioperl.org>
Message-ID: <369098.81077.qm@web56507.mail.re3.yahoo.com>

But the problem is that I don't really get any output on the screen. In the /tmp directory I get 4 files, namely parents, nodes, id2names and names2id, but I don't know what to make of them. This is what my script looks like:

#!/usr/bin/perl
use strict;
#use warnings;
use DBI;
use Bio::Tree::Node;
use Bio::DB::Taxonomy;
use Bio::DB::Taxonomy::flatfile;
my $idx_dir = '/tmp';
my $nodefile;
my $namesfile;
my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
                               -nodesfile => $nodefile,
                               -namesfile => $namesfile,
                               -directory => $idx_dir);
my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
for my $child ( @extant_children ) {
    print "id is ", $child->id, "\n"; # NCBI taxa id
    print "rank is ", $child->rank, "\n"; # e.g. species
    print "scientific name is ", $child->scientific_name, "\n"; # scientific name
}

Thanks.
George

Jason Stajich wrote:
> All the children are in this array.
>
> You get to decide what you want to do with them. In the following example I print the id, rank, and scientific name out to the screen. Because this is a taxonomy db query you are getting back Bio::Taxonomy::Taxon objects so read the documentation for this module to see what you can do with the object.
> I would also suggest spending a little time with the Getting started and HOWTO:Trees documentation on the website to get familiar with the objects and nomenclature.
>
> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
> for my $child ( @extant_children ) {
>     print "id is ", $child->id, "\n"; # NCBI taxa id
>     print "rank is ", $child->rank, "\n"; # e.g. species
>     print "scientific name is ", $child->scientific_name, "\n"; # scientific name
> }
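[Editorial sketch, not part of the original thread.] The traversal asked about in this thread (collect every descendant of a taxon, then keep only the leaves) can be illustrated without BioPerl at all. The following is a minimal sketch under stated assumptions: it uses only core Perl, it is not the Bio::DB::Taxonomy implementation, and the six sample rows and their ids are made up for illustration. The only thing taken from NCBI is the nodes.dmp layout (tax_id, parent tax_id, rank, separated by tab-pipe-tab, with the root listed as its own parent). The helper name all_descendants is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up miniature of a nodes.dmp file: tax_id, parent tax_id, rank.
# These ids are illustrative only, not real NCBI taxon ids.
my @sample = (
    "1\t|\t1\t|\tno rank\t|",
    "10\t|\t1\t|\tkingdom\t|",
    "20\t|\t10\t|\tgenus\t|",
    "21\t|\t20\t|\tspecies\t|",
    "22\t|\t20\t|\tspecies\t|",
    "30\t|\t10\t|\tspecies\t|",
);

# Build a parent -> children lookup from the rows.
my (%children, %rank);
for my $line (@sample) {
    (my $clean = $line) =~ s/\t\|$//;          # drop the trailing "\t|"
    my ($id, $parent, $rank) = split /\t\|\t/, $clean;
    $rank{$id} = $rank;
    # The root row lists itself as parent; skip it to avoid a cycle.
    push @{ $children{$parent} }, $id unless $id == $parent;
}

# Hypothetical helper: return every descendant of $id, depth first.
sub all_descendants {
    my ($id) = @_;
    my @found;
    for my $child (@{ $children{$id} || [] }) {
        push @found, $child, all_descendants($child);
    }
    return @found;
}

my @descendants = all_descendants(10);

# A leaf is a node that never appears as a parent.
my @leaves = grep { !exists $children{$_} } @descendants;

print "descendants: @descendants\n";
print "leaves: @leaves\n";
```

With the sample rows above this prints "descendants: 20 21 22 30" and "leaves: 21 22 30". The recursion is the "recursing sub" idea suggested earlier in the thread; for a real taxonomy dump an explicit stack may be preferable to deep recursion, though that is a design choice rather than a requirement.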
From jason at bioperl.org Mon Jun 18 21:05:43 2007
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 18 Jun 2007 18:05:43 -0700
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <369098.81077.qm@web56507.mail.re3.yahoo.com>
References: <369098.81077.qm@web56507.mail.re3.yahoo.com>
Message-ID:

The files are indexes because you are indexing a flatfile - this speeds up the lookup so the second time you run the script it doesn't have to index. You don't need to look at the files; they won't make sense to a human!

The reason it isn't printing anything is that someone didn't really write the implementation quite right. This code was overhauled by Sendu before the last release; I guess something didn't quite get connected. I checked in code that has the Bio::Taxon delegating now to a DB handle for the each_Descendent call.

You can either patch your code or just use the code listed here: http://bioperl.org/wiki/Module:Bio::DB::Taxonomy

On Jun 18, 2007, at 5:29 PM, George Heller wrote:
> But the problem is that I don't really get any output on the screen. In the /tmp directory I get 4 files namely parents, nodes, id2names and names2id, but I dont know what to make of them. This is what my script looks like,
>
> #!/usr/bin/perl
> use strict;
> #use warnings;
> use DBI;
> use Bio::Tree::Node;
> use Bio::DB::Taxonomy;
> use Bio::DB::Taxonomy::flatfile;
> my $idx_dir = '/tmp';
> my $nodefile;
> my $namesfile;
>
> my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
> my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
>                                -nodesfile => $nodefile,
>                                -namesfile => $namesfile,
>                                -directory => $idx_dir);
> my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
> for my $child ( @extant_children ) {
>     print "id is ", $child->id, "\n"; # NCBI taxa id
>     print "rank is ", $child->rank, "\n"; # e.g. species
>     print "scientific name is ", $child->scientific_name, "\n"; # scientific name
> }
>
> Thanks.
> George

-- Jason Stajich jason at bioperl.org http://jason.open-bio.org/

From torsten.seemann at infotech.monash.edu.au Mon Jun 18 21:21:04 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Tue, 19 Jun 2007 11:21:04 +1000
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <4676A01F.30205@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk> <4676A01F.30205@sendu.me.uk>
Message-ID:

Sendu,

> >> Can anyone offer a way to systematically find at least the test scripts
> >> which access the internet, if not the specific tests within?

Perhaps you could use 'strace' to list network system calls for each test script, and grep out AF_INET connections?

% strace -e trace=network command_to_test 2>&1 | grep AF_INET

I'm not an strace expert but it might do what you need.

--
--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University
--Tel +61 3 9905 9010

From george.heller at yahoo.com Mon Jun 18 21:16:10 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 18:16:10 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To:
Message-ID: <815364.33231.qm@web56512.mail.re3.yahoo.com>

Works perfectly. Thanks so much Jason, Hilmar, Chris. You've been a
great help!

Thanks.
George

Jason Stajich wrote:
The files are indexes because you are indexing a flatfile - this speeds
up the lookup so the second time you run the script it doesn't have to
index. You don't need to look at the files, they won't make sense to a
human!

The reason it isn't printing anything is someone didn't really write
the implementation quite right. This code was overhauled by Sendu
before the last release; I guess something didn't quite get connected.

I checked in code that has the Bio::Taxon delegating now to a DB handle
for the each_Descendent call.
You can either patch your code or just use the code listed here:
http://bioperl.org/wiki/Module:Bio::DB::Taxonomy

On Jun 18, 2007, at 5:29 PM, George Heller wrote:
But the problem is that I don't really get any output on the screen. In
the /tmp directory I get 4 files, namely parents, nodes, id2names and
names2id, but I don't know what to make of them.
This is what my script looks like,

#!/usr/bin/perl
use strict;
#use warnings;
use DBI;
use Bio::Tree::Node;
use Bio::DB::Taxonomy;
use Bio::DB::Taxonomy::flatfile;

my $idx_dir = '/tmp';
my $nodefile;
my $namesfile;
my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
                               -nodesfile => $nodefile,
                               -namesfile => $namesfile,
                               -directory => $idx_dir);
my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
for my $child ( @extant_children ) {
    print "id is ", $child->id, "\n";                           # NCBI taxa id
    print "rank is ", $child->rank, "\n";                       # e.g. species
    print "scientific name is ", $child->scientific_name, "\n"; # scientific name
}

Thanks.
George

Jason Stajich wrote:
All the children are in this array. You get to decide what you want to
do with them. In the following example I print the id, rank, and
scientific name out to the screen. Because this is a taxonomy db query
you are getting back Bio::Taxonomy::Taxon objects, so read the
documentation for this module to see what you can do with the object.
I would also suggest spending a little time with the Getting started
and HOWTO:Trees documentation on the website to get familiar with the
objects and nomenclature.

my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
for my $child ( @extant_children ) {
    print "id is ", $child->id, "\n";                           # NCBI taxa id
    print "rank is ", $child->rank, "\n";                       # e.g. species
    print "scientific name is ", $child->scientific_name, "\n"; # scientific name
}

On Jun 18, 2007, at 5:04 PM, George Heller wrote:
Ok, I installed the latest Scalar::Util and the script seems to be
working. But I am confused about where exactly I need to look for the
descendent taxon ids once the script is run. I did look into the /tmp/
directory, but I couldn't understand much. Sorry to be bothering you, I
really appreciate your patience.

Thanks.
George Jason Stajich wrote: Try installing the latest Scalar::Util On Jun 18, 2007, at 4:05 PM, George Heller wrote: This is the output of /usr/bin/perl -V Summary of my perl5 (revision 5 version 8 subversion 5) configuration: Platform: osname=linux, osvers=2.6.9-22.18.bz155725.elsmp, archname=i386-linux-thread-multi uname='linux hs20-bc1-4.build.redhat.com 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686 i686 i386 gnulinux ' config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost -Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0' hint=recommended, useposix=true, d_sigaction=define usethreads=define use5005threads=undef useithreads=define usemultiplicity=define useperlio=define d_sfio=undef uselargefiles=define usesocks=undef use64bitint=undef use64bitall=undef uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm', optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4', cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm' ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-2)', gccosandvers='' intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12 ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8 alignbytes=4, prototype=define Linker and Libraries: 
ld='gcc', ldflags =' -L/usr/local/lib' libpth=/usr/local/lib /lib /usr/lib libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc libc=/lib/libc-2.3.4.so, so=so, useshrplib=true, libperl=libperl.so gnulibc_version='2.3.4' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE' cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib' Characteristics of this binary (from libperl): Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS USE_LARGE_FILES PERL_IMPLICIT_CONTEXT Built under linux Compiled at Jul 24 2006 18:28:10 @INC: /usr/lib/perl5/5.8.5/i386-linux-thread-multi /usr/lib/perl5/5.8.5 /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.5 /usr/lib/perl5/site_perl/5.8.4 /usr/lib/perl5/site_perl/5.8.3 /usr/lib/perl5/site_perl/5.8.2 /usr/lib/perl5/site_perl/5.8.1 /usr/lib/perl5/site_perl/5.8.0 /usr/lib/perl5/site_perl /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.5 /usr/lib/perl5/vendor_perl/5.8.4 /usr/lib/perl5/vendor_perl/5.8.3 /usr/lib/perl5/vendor_perl/5.8.2 /usr/lib/perl5/vendor_perl/5.8.1 /usr/lib/perl5/vendor_perl/5.8.0 /usr/lib/perl5/vendor_perl Thanks. George . 
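
For anyone else who runs into the "Weak references are not implemented" error discussed in this thread, the following is a minimal sketch (not something posted by the participants) of how one might check whether the installed Scalar::Util actually provides weaken(). It uses only Scalar::Util's weaken() and isweak(); the variable names and the test hash are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Load Scalar::Util at run time, inside eval, so that an installation
# without weaken() is reported here instead of aborting compilation.
my $has_weaken = eval {
    require Scalar::Util;
    Scalar::Util->import(qw(weaken isweak));
    my $target = { name => 'test' };    # illustrative data only
    my $copy   = $target;               # second reference to the same hash
    Scalar::Util::weaken($copy);        # dies here if weaken is unsupported
    Scalar::Util::isweak($copy) ? 1 : 0;
};

if ($has_weaken) {
    print "Scalar::Util ", Scalar::Util->VERSION, ": weak references work\n";
}
else {
    my $err = $@ || "isweak() returned false\n";
    print "weak references unavailable: $err";
}
```

If this reports that weak references are unavailable, installing a current Scalar::Util (as suggested in the thread) would be the first thing to try.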
Hilmar Lapp wrote:
The perl version appears to be 5.8.5 though, so something strange
appears to be going on too. George, can you please post the output of

$ /usr/bin/perl -V

-hilmar

On Jun 18, 2007, at 6:33 PM, Chris Fields wrote:
As the error implies, your local version of perl doesn't seem to
support weak references, which means it doesn't have Scalar::Util
(which was added to core after perl 5.6.1, I think). Try installing
Scalar::Util to see what happens.

chris

On Jun 18, 2007, at 5:18 PM, George Heller wrote:
I tried running the script mentioned below and I seem to be getting
the following error:

Weak references are not implemented in the version of perl at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76.
Compilation failed in require at my.pl line 7.
BEGIN failed--compilation aborted at my.pl line 7.

My script looks something like,

#!/usr/bin/perl
use strict;
#use warnings;
use DBI;
use Bio::Tree::Node;
use Bio::DB::Taxonomy;
use Bio::DB::Taxonomy::flatfile;

my $idx_dir = '/tmp';
my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
                               -nodesfile => $nodefile,
                               -namesfile => $namesfile,
                               -directory => $idx_dir);
my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
foreach my $field (@extant_children) {
    print "$field";
    print "|";
    print "\n";
}

And I am running the script using the command,

perl myscript.pl -v --names names.dmp --nodes nodes.dmp

and I have the nodes.dmp and names.dmp files in the current directory.

Thanks,
George

Jason Stajich wrote:
It is implemented in the implementing class - DB::Taxonomy is just the
base class. For example see the flatfile implementation
Bio::DB::Taxonomy::flatfile

See the scripts/taxa/local_taxonomydb_query.PLS for an example using
it: nodes and names are from the NCBI taxonomy database.
Here is an un-debugged copy+paste for your question that *should* work.

use Bio::DB::Taxonomy;

my $idx_dir = '/tmp';
my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
                               -nodesfile => $nodefile,
                               -namesfile => $namesfile,
                               -directory => $idx_dir);
my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

-jason

On Jun 18, 2007, at 10:07 AM, George Heller wrote:
What exactly is the "node n" in the query below? When I issue this
query, it says, relation "node" does not exist.

I tried to use the get_all_Descendents method but it looks like in
order to do a recursive call it calls the method each_Descendent. This
method is not implemented in Bio::DB::Taxonomy. It just has a single
line,

shift->throw_not_implemented();

Thanks.
George.

Hilmar Lapp wrote:
I'm a bit confused - it sounds like you have set up a local BioSQL
database and loaded the NCBI taxonomy into the database.

You can now use simple SQL to retrieve all descendants of a node in the
tree given its NCBI taxonID such as

SELECT tn.*, tnm.name
FROM taxon tn, taxon_name tnm, node n
WHERE n.ncbi_taxon_id = :taxonID
AND tn.left_value > n.left_value
AND tn.right_value < n.right_value
AND tn.taxon_id = tnm.taxon_id
AND tn.name_class = 'scientific_name'

BioPerl doesn't have a Taxonomy::biosql module yet (though this would
seem like a worthwhile thing to add), so you can't use the
Bio::DB::Taxonomy interface to do this against a BioSQL instance.

However, BioPerl does have support for the flat-file download of the
NCBI taxonomy database and indexes it, so you can simply use
Taxonomy::{get_taxon,get_all_Descendents} using the flatfile download
to achieve what you wanted to do in less than 5 lines of perl.

Although the recursive implementation of
Taxonomy::get_all_Descendents() won't be lightning fast, it may still
be perfectly fine for your application - are you sure it is not?
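
To make the left_value/right_value condition in the query above more concrete, here is a small self-contained sketch of the nested-set idea in plain Perl. It is only an illustration: the five-row table, its ids, names and numbers are invented (they are not NCBI or BioSQL data), and the helper names descendants_of and is_leaf are hypothetical. Only the interval comparison mirrors the WHERE clause.

```perl
use strict;
use warnings;

# Toy nested-set table: left/right values come from a depth-first walk,
# so every subtree occupies one contiguous interval.
# All ids, names and numbers are invented for this example.
my @taxon = (
    { id => 1, name => 'root',    left => 1, right => 10 },
    { id => 2, name => 'clade A', left => 2, right => 7  },
    { id => 3, name => 'leaf A1', left => 3, right => 4  },
    { id => 4, name => 'leaf A2', left => 5, right => 6  },
    { id => 5, name => 'leaf B',  left => 8, right => 9  },
);

# All rows strictly inside the interval of the row with the given id,
# i.e. the same test as "tn.left_value > n.left_value AND
# tn.right_value < n.right_value".
sub descendants_of {
    my ($id) = @_;
    my ($n) = grep { $_->{id} == $id } @taxon;
    return () unless $n;
    return grep { $_->{left} > $n->{left} && $_->{right} < $n->{right} } @taxon;
}

# A leaf has no room for children between its left and right values.
sub is_leaf {
    my ($t) = @_;
    return $t->{right} == $t->{left} + 1;
}

my @below_a = descendants_of(2);
my @leaves  = grep { is_leaf($_) } descendants_of(1);

print "below clade A: ", join(', ', map { $_->{name} } @below_a), "\n";
print "leaves below root: ", join(', ', map { $_->{name} } @leaves), "\n";
```

The point of the encoding is that one range comparison returns the whole subtree, with no recursion needed on the client side.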

-hilmar

On Jun 18, 2007, at 12:21 AM, George Heller wrote:
Thanks. And how can I assign the $node here in the code below, such
that I can reference it to a particular taxon id record? I want to
retrieve all the descendents from the taxonomy hierarchy, given a
particular taxon id.

I have a local db setup, in which I have uploaded data using the
load_ncbi_taxonomy.pl script.

Thanks.
George

Jason Stajich wrote:
I assume you already figured out how to set up a local taxonomydb?

You just want the extant species/leaves of the tree

my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

-jason

On Jun 17, 2007, at 11:41 AM, George Heller wrote:
Hi all,

Can anyone point me to some example that uses the get_all_Descendents
method from Bio::DB::Taxonomy? I am a newbie at this, and I am not
quite sure how to implement it.

Thanks.
George

Sendu Bala wrote:
George Heller wrote:
Hi all,

I am looking at extracting the taxonomy hierarchy for some taxon ids.
What I plan to do is, for a given taxon id, say 33090, I want to
extract all taxon ids that are children of this species. I do not just
want the immediate children, but the children's children and so on.

Any ideas on the way I can go about doing this?

Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
some kind of looping structure. Most easily a recursing sub.

If you happen to code up something neat and efficient, why not share it
with us and we could add it to the Taxonomy module(s).

From torsten.seemann at infotech.monash.edu.au Mon Jun 18 21:26:41 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Tue, 19 Jun 2007 11:26:41 +1000
Subject: [Bioperl-l] gff2xml
In-Reply-To:
References: <462784640706120904g25a6550dsc56a22af64ca98cd@mail.gmail.com>
Message-ID:

(Sean, please reply to the bioperl-l list rather than to me personally
so everyone can read it. I'm reposting it here.)

> > I posted this on the gbrowse list earlier. I'm looking to convert gff
> > data files into xml. Does anyone know of a module written to do this
> > already?
>
> What DTD do you want the XML to conform to?
> eg. ChadoXML, TinySeq XML, TIGR XML ... ?

Hi Torsten,
I'm collaborating with other groups and want web-service compatible
functionality for various tools. Normally the analysis tools I'm using
generate gff output. I'm going to have to wrap this output in XML with
an XSL stylesheet for end-users to view. I haven't done it before and
don't know what DTD to use. The bp_seqconvert.pl script doesn't accept
gff format. I would imagine the DTD would be quite short as the gff
files are very standard; I just don't have any experience with these
DTD requirements.

--Sean O'Keeffe

From sac at bioperl.org Tue Jun 19 02:42:27 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Mon, 18 Jun 2007 23:42:27 -0700
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy)
Message-ID: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>

On 6/16/07, Jason Stajich wrote:
> [...]
> Just to say I already went through all the steps of running cvs2svn
> myself and had problems gathering back out the branches and all the
> tags when I tried it.
> If you want to start with a smaller repository
> like bioperl-network or bioperl-db as the initial cvs2svn conversion
> script took quite a long time to run on bioperl-live.

Might this be a good opportunity to investigate partitioning
bioperl-live into sub-repositories? There has been talk in the past of
defining a set of "core" modules separate from other functionally
related groups of modules that would be viewed as optional extensions.
The goal being to help manage growth and simplify releases. There are
currently 892 modules under Bio/.

In addition to simplifying the migration to SVN, it would also have
other benefits. Say some new functionality or a slew of fixes were
added to Bio::Graphics. We could turn around a new Bio::Graphics
release quickly without having to work on getting various other parts
up to snuff that aren't related to graphics (Biblio, DB, PopGen,
Search etc.). Maintenance and releases of the various extensions would
be more parallelizable, orchestrated by separate ring leaders.

Over time, as a set of functionality matures, it would see fewer
updates and there would be less of a need for users to
download/install/test it. This could make bioperl easier to customize,
extend, and grok in general.

Long term, it should ease development and release cycles, but it will
involve a bit of near term bullet-biting. We'd need to get clear on how
to partition things, including modules, tests, docs, installation
logic, etc. and we'd probably need new integration tests to verify that
the subsets continue working together.

What do folks think? Would this SVN-based, re-partitioned bioperl-live
constitute a 2.0 release? Any volunteers to help assemble a roadmap and
milestones? Should I go on dreaming?
Cheers, Steve From bix at sendu.me.uk Tue Jun 19 03:01:05 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 19 Jun 2007 08:01:05 +0100 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: References: <369098.81077.qm@web56507.mail.re3.yahoo.com> Message-ID: <46777F31.7030402@sendu.me.uk> Jason Stajich wrote: > The reason it isn't printing anything is someone didn't really write > the implementation quite right. This code was overhauled by Sendu > before the last release I guess something didn't quite get connected. > > I checked in code that has the Bio::Taxon delegating now to a DB > handle for the each_Descendent call. > You can either patch your code or just use the code listed here: > http://bioperl.org/wiki/Module:Bio::DB::Taxonomy I've reverted that change. For some reason the docs for Bio::Taxon::each_Descendent aren't showing up on the website, but they state: --- Note that this method never asks the database for the descendents; it will only return objects you have manually set with add_Descendent(), or where this was done for you by making a Bio::Tree::Tree with this object as an argument to new(). To get the database descendents use $taxon->db_handle->each_Descendent($taxon). --- I also have a note in the Synopsis for the module: --- # Though be careful with each_Descendent - unless you add_Descendent() # yourself, you won't get an answer because unlike for ancestor(), # Bio::Taxon does not ask the database for the answer. You can ask the # database yourself using the same method: ($human) = $homo->db_handle->each_Descendent($homo); --- This is quite deliberate and is to prevent Bad Things from happening. (Can't exactly remember the reasoning now, but I know it was good.) 
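
As an illustration of the pattern described above (asking the database handle, rather than the taxon object, for descendants at every level), here is a toy sketch in plain Perl. ToyTaxonDB is a made-up stand-in and not a BioPerl class, the taxon labels are placeholders, and leaves_below is a hypothetical helper; only the method name each_Descendent mirrors the real call. With a real database handle one would presumably call $taxon->db_handle->each_Descendent($taxon) at the marked place instead.

```perl
use strict;
use warnings;

# ToyTaxonDB is a hypothetical stand-in for a taxonomy database handle.
package ToyTaxonDB;

sub new {
    my ($class, %children) = @_;
    return bless { children => {%children} }, $class;
}

# Return only the immediate children of a taxon, like a database lookup.
sub each_Descendent {
    my ($self, $taxon) = @_;
    return @{ $self->{children}{$taxon} || [] };
}

package main;

# Walk down from $taxon, asking the database at every level, and
# collect the taxa that have no children (the leaves).
sub leaves_below {
    my ($db, $taxon) = @_;
    my @leaves;
    for my $child ($db->each_Descendent($taxon)) {   # database is asked here
        my @below = leaves_below($db, $child);
        push @leaves, @below ? @below : $child;
    }
    return @leaves;
}

# Placeholder hierarchy: root has a clade and a leaf; the clade has two leaves.
my $db = ToyTaxonDB->new(
    'root'   => [ 'cladeA', 'leafB' ],
    'cladeA' => [ 'leafA1', 'leafA2' ],
);

print join("\n", leaves_below($db, 'root')), "\n";
```

Because only one level is fetched per call, nothing is pulled into memory beyond the branch actually being walked, which is the trade-off the thread is discussing.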
From bix at sendu.me.uk Tue Jun 19 03:41:57 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 19 Jun 2007 08:41:57 +0100 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> Message-ID: <467788C5.6070406@sendu.me.uk> Steve Chervitz wrote: > Might this been a good opportunity to investigate partitioning > bioperl-live into sub-repositories? There has been talk in the past of > defining a set of "core" modules separate from other functionally > related groups of modules that would be viewed as optional extensions. > The goal being to help manage growth and simplify releases. There are > currently 892 modules under Bio/. > > In addition to simplifying the migration to SVN, it would also have > other benefits. Say some new functionality or a slew of fixes were > added to Bio::Graphics. We could turn around a new Bio::Graphics > release quickly without having to work on getting various other parts > up to snuff that aren't related to graphics (Biblio, DB, PopGen, > Search etc.). Maintenance and releases of the various extensions would > be more parallelizable, orchestrated by separate ring leaders. > > Over time, as a set of functionality matures, it would see fewer > updates and there would be less of a need for users to > download/install/test it. This could make bioperl easier to customize, > extend, and grok in general. > > Long term, it should ease development and release cycles I actually take the opposite view. Breaking things up makes testing and releases more difficult. If one person acts as pumpkin for all the sub-parts, his work-load increases almost linearly with the number of sub-parts. If each sub-part gets its own pumpkin, where do all these pumpkins come from? 
It seems to me that frequently authors will write modules but
inevitably their circumstances change and they can no longer devote the
time to look after them. Having a single pumpkin and 'forcing' him to
make sure everything works (regardless of his personal interest in the
module) seems more reliable than hoping there will be a person
interested enough in each sub-part to handle its release.

Since all sub-parts will at the least interact with the 'true' core set
of Bioperl modules, they need to be tested and potentially re-released
every time the true core is updated. And since some sub-parts will
interact with other sub-parts, there will need to be coordinated
joint-testing and release of multiple sub-parts.

What happens when users report problems? We ask them what version
they're running. Right now '1.5.2' means a specific thing, and it's
trivial for someone to confirm the same problem by installing 1.5.2.
What happens when users have to list out all the versions of all the
sub-parts they have? Who is going to consistently recreate a user's
hodge-podge of versions in order to confirm a bug? Won't the advice
instead be: "update all versions to the latest and get back to us"?

So, as I see it, all sub-parts would best be tested and released with a
single new version number every time one sub-part is updated
(significantly). In which case, why have sub-parts at all? Keeping
things the way they are now means ease of release for the pumpkin and
ease of installation for end-users (only one install command to issue
to CPAN).

Having 'true' sub-parts (each with its own pumpkin), in my fatalistic
view, is just going to lead to some useful sub-parts being abandoned
and never updated, even where updates may be desirable.

Each and every Bio:: module could have been released separately by its
respective author.
As I see it, one of the main values of 'Bioperl' is that its one (reasonably) consistent collection of modules that lowers the barrier of entry for new Bioinformaticians, giving them extremely easy access to a whole host of functionality with a single install. From hlapp at gmx.net Tue Jun 19 08:47:02 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 19 Jun 2007 08:47:02 -0400 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <46777F31.7030402@sendu.me.uk> References: <369098.81077.qm@web56507.mail.re3.yahoo.com> <46777F31.7030402@sendu.me.uk> Message-ID: <5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net> So the real mistake was to write my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents; instead of my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); my @extant_children = grep { $_->is_Leaf } $db->get_all_Descendents ($node); I.e., the Bio::DB::Taxonomy object *will* (or is allowed to) ask the database? If this is correct, can we highlight this in the documentation? It's a small difference that everyone failed to spot. If it is not correct, then maybe we need to revisit the rationale for why a Bio::DB::Taxonomy::get_all_Descendents may not query the underlying database. Also, in my reading of Bio::Taxonomy::Taxon it won't use the database either for ancestor(). Which would be consistent with its other methods. I.e., the bottom line is don't use Node or Taxon objects for hierarchy queries that you expect to use an underlying database, use the Bio::DB::Taxonomy object instead. It makes sense, but is it true? -hilmar On Jun 19, 2007, at 3:01 AM, Sendu Bala wrote: > Jason Stajich wrote: >> The reason it isn't printing anything is someone didn't really write >> the implementation quite right. This code was overhauled by Sendu >> before the last release I guess something didn't quite get connected. 
>> >> I checked in code that has the Bio::Taxon delegating now to a DB >> handle for the each_Descendent call. >> You can either patch your code or just use the code listed here: >> http://bioperl.org/wiki/Module:Bio::DB::Taxonomy > > I've reverted that change. > > For some reason the docs for Bio::Taxon::each_Descendent aren't > showing > up on the website, but they state: > > --- > Note that this method never asks the database for the descendents; it > will only return objects you have manually set with add_Descendent > (), or > where this was done for you by making a Bio::Tree::Tree with this > object > as an argument to new(). > > To get the database descendents use > $taxon->db_handle->each_Descendent($taxon). > --- > > > I also have a note in the Synopsis for the module: > > --- > # Though be careful with each_Descendent - unless you add_Descendent() > # yourself, you won't get an answer because unlike for ancestor(), > # Bio::Taxon does not ask the database for the answer. You can ask the > # database yourself using the same method: > ($human) = $homo->db_handle->each_Descendent($homo); > --- > > > This is quite deliberate and is to prevent Bad Things from happening. > (Can't exactly remember the reasoning now, but I know it was good.) > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From rvos at interchange.ubc.ca Tue Jun 19 09:05:25 2007 From: rvos at interchange.ubc.ca (rvos) Date: Tue, 19 Jun 2007 06:05:25 -0700 (PDT) Subject: [Bioperl-l] SVN and ...Re: Perltidy Message-ID: <15433211.1182258325544.JavaMail.myubc2@brahms.my.ubc.ca> > Unrelated, but it randomly just occurred to me: what happens to all the > id lines at the top of modules? 
Eg: > > $Id: bl2seq.pm,v 1.28 2007/06/14 14:16:10 sendu Exp $ > > That's a cvs-specific thing, right? Do we delete them all? (Regardless, > I wish we would, since they caused me no end of hassles during the 1.5.2 > release, doing updates across branches.) If you run something like 'svn propset svn:keywords Id' on the file/folder/recursively, svn picks up on the $Id tag. The structure of the resulting string would be a little different, because svn revision numbers are simply auto-increasing integers (afaik) - so any regular expressions that cleverly want to include the revision number in $VERSION would need to be updated. From bix at sendu.me.uk Tue Jun 19 10:25:26 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 19 Jun 2007 15:25:26 +0100 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net> References: <369098.81077.qm@web56507.mail.re3.yahoo.com> <46777F31.7030402@sendu.me.uk> <5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net> Message-ID: <4677E756.6050200@sendu.me.uk> Hilmar Lapp wrote: > So the real mistake was to write > > my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); > my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents; > > instead of > > my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); > my @extant_children = grep { $_->is_Leaf } $db->get_all_Descendents > ($node); > > I.e., the Bio::DB::Taxonomy object *will* (or is allowed to) ask the > database? Yes, the database object methods use the database. I don't even think it makes sense to question that. What else would it do? > If this is correct, can we highlight this in the documentation? It's > a small difference that everyone failed to spot. The documentation for what? I've already clearly pointed out the gotcha in Bio::Taxon. > Also, in my reading of Bio::Taxonomy::Taxon it won't use the database > either for ancestor(). Which would be consistent with its other methods. Bio::Taxonomy::Taxon? Its deprecated. 
Bio::Taxon is what we're dealing with, and it /does/ use the db to get the ancestor, unless the ancestor is manually set (see below for explanation). > I.e., the bottom line is don't use Node or Taxon objects for > hierarchy queries that you expect to use an underlying database, use > the Bio::DB::Taxonomy object instead. It makes sense, but is it true? Almost. It happens to be true but ideally wouldn't be the case. The confusion and problems arise, I guess, because we have two ways to access/create hierarchies and both of them are built from the same building block (Bio::Taxon objects). On the one hand we have Bio::DB::Taxonomy and the other we have Bio::Tree::Tree. Tree objects are easy: you have a Taxon object created in memory for each and every node in the tree. Each Taxon knows its ancestor and descendants by storing references to the relevant Taxon objects in the tree. You 'navigate' through the tree by grabbing a Taxon inside it and asking the Taxon itself for its ancestor or descendant. This leaves us with the Taxon object having the methods ancestor() and each_Descendent(), which we'll expect to work in other circumstances. Bio::DB::Taxonomy returns single Taxon objects from the database on request. Now we still expect our ancestor() and each_Descendent() methods to work, but if things were set up like Bio::Tree::Tree we'd end up pulling the entire database into memory because we'd have to create all the Taxon objects that are ancestors and descendants, recursively, every time we request a single Taxon (which is wasteful in the case of Bio::DB::Taxonomy::flatfile and slow/not allowed in the case of Bio::DB::Taxonomy::entrez). The solution? We simply don't create the immediate ancestor or descendant Taxon objects of the requested Taxon, and instead implement the Taxon methods to ask the database to create them on demand, if they don't already exist. 
Well, that idea is fine (and necessary) for the ancestor method, but we run into problems with each_Descendent(). The problem arises when we create Bio::Tree::Tree objects from a Taxon we got from the database. Being able to do that is why Bio::Taxon is shared between them, as it is a very desirable thing to do: you can instantly create a lineage tree for a Taxon of interest and then use all the Bio::Tree::Tree methods on it. Unfortunately one of those methods is get_nodes() which is implemented using each_Descendent() and get_all_Descendents(). If each_Descendent() asked the database for the real answer, we'd end up pulling the entire database into the tree. So my implementation was to not ask the database and just warn people in the docs. Ideally it /would/ use the database, because that's what a user would expect. Can anyone see an alternate way around the problem? From hlapp at gmx.net Tue Jun 19 12:14:38 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 19 Jun 2007 12:14:38 -0400 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <4677E756.6050200@sendu.me.uk> References: <369098.81077.qm@web56507.mail.re3.yahoo.com> <46777F31.7030402@sendu.me.uk> <5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net> <4677E756.6050200@sendu.me.uk> Message-ID: Sorry I was accidentally looking at an older branch. Reading through the Taxon module I get more confused though than would leave me at ease. Here's what I understand of your description of the problem: - We would like nodes returned from Bio::DB::Taxonomy to use the database for all hierarchical queries. - We would like nodes used in a Bio::Tree::Tree not to use the database for any hierarchical query. What I understand that we have is - Taxon node objects that have a db_handle set will use the database for ancestor(), unless it has been set manually (?), but not for each_Descendent(). - Taxon node objects that don't have a db_handle set won't use a database but will function normally otherwise. 
- This is needed to prevent Bio::Tree::Tree methods from pulling the entire tree into memory.

If this is correct (I'm not sure it is), it sounds like we want to temporarily divorce taxonomy nodes from their database capabilities while they are being queried in a tree context?

I'm still trying to understand - if I create a Bio::Tree::Tree from a single node, will the tree automatically contain all nodes along the lineage of ancestors up to the root? So, even if extracting this lineage involved querying a database it would be acceptable, but not for querying descendents?

It sounds to me like what is needed is that nodes that get added to a tree need to be stripped of their database capabilities. This could be achieved by creating a wrapper class that delegates all non-hierarchical methods to the wrapped Taxon object, and overriding all hierarchical queries to not use a database. I'm not sure I fully understand yet though, but the inconsistent behavior will be sure to throw people off track.

	-hilmar

On Jun 19, 2007, at 10:25 AM, Sendu Bala wrote:

> Hilmar Lapp wrote:
>> So the real mistake was to write
>> my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>> instead of
>> my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>> my @extant_children = grep { $_->is_Leaf } $db->get_all_Descendents($node);
>> I.e., the Bio::DB::Taxonomy object *will* (or is allowed to) ask
>> the database?
>
> Yes, the database object methods use the database. I don't even
> think it makes sense to question that. What else would it do?
>
>
>> If this is correct, can we highlight this in the documentation?
>> It's a small difference that everyone failed to spot.
>
> The documentation for what? I've already clearly pointed out the
> gotcha in Bio::Taxon.
>
>
>> Also, in my reading of Bio::Taxonomy::Taxon it won't use the
>> database either for ancestor().
Which would be consistent with >> its other methods. > > Bio::Taxonomy::Taxon? Its deprecated. Bio::Taxon is what we're > dealing with, and it /does/ use the db to get the ancestor, unless > the ancestor is manually set (see below for explanation). > > >> I.e., the bottom line is don't use Node or Taxon objects for >> hierarchy queries that you expect to use an underlying database, >> use the Bio::DB::Taxonomy object instead. It makes sense, but is >> it true? > > Almost. It happens to be true but ideally wouldn't be the case. The > confusion and problems arise, I guess, because we have two ways to > access/create hierarchies and both of them are built from the same > building block (Bio::Taxon objects). > > On the one hand we have Bio::DB::Taxonomy and the other we have > Bio::Tree::Tree. > > Tree objects are easy: you have a Taxon object created in memory > for each and every node in the tree. Each Taxon knows its ancestor > and descendants by storing references to the relevant Taxon objects > in the tree. You 'navigate' through the tree by grabbing a Taxon > inside it and asking the Taxon itself for its ancestor or descendant. > > This leaves us with the Taxon object having the methods ancestor() > and each_Descendent(), which we'll expect to work in other > circumstances. > > Bio::DB::Taxonomy returns single Taxon objects from the database on > request. Now we still expect our ancestor() and each_Descendent() > methods to work, but if things were set up like Bio::Tree::Tree > we'd end up pulling the entire database into memory because we'd > have to create all the Taxon objects that are ancestors and > descendants, recursively, every time we request a single Taxon > (which is wasteful in the case of Bio::DB::Taxonomy::flatfile and > slow/not allowed in the case of Bio::DB::Taxonomy::entrez). > > The solution? 
We simply don't create the immediate ancestor or > descendant Taxon objects of the requested Taxon, and instead > implement the Taxon methods to ask the database to create them on > demand, if they don't already exist. Well, that idea is fine (and > necessary) for the ancestor method, but we run into problems with > each_Descendent(). > > The problem arises when we create Bio::Tree::Tree objects from a > Taxon we got from the database. Being able to do that is why > Bio::Taxon is shared between them, as it is a very desirable thing > to do: you can instantly create a lineage tree for a Taxon of > interest and then use all the Bio::Tree::Tree methods on it. > Unfortunately one of those methods is get_nodes() which is > implemented using each_Descendent() and get_all_Descendents(). If > each_Descendent() asked the database for the real answer, we'd end > up pulling the entire database into the tree. > > So my implementation was to not ask the database and just warn > people in the docs. Ideally it /would/ use the database, because > that's what a user would expect. Can anyone see an alternate way > around the problem? -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cain.cshl at gmail.com Tue Jun 19 14:41:52 2007 From: cain.cshl at gmail.com (Scott Cain) Date: Tue, 19 Jun 2007 14:41:52 -0400 Subject: [Bioperl-l] [Gmod-gbrowse] is this a bp_genbank2gff3.pl bug? In-Reply-To: <18039.61086.829726.809888@gargle.gargle.HOWL> References: <18039.61086.829726.809888@gargle.gargle.HOWL> Message-ID: <1182278512.2592.42.camel@localhost.localdomain> Hi Alessandra, I cc'ed your message to the bioperl and sequence ontology mailing lists, since your question is relevant to both. 
Converting genbank files to GFF3 is excruciatingly difficult; I generally find that I can use the genbank2gff3 script to get me most of the way there, but then I need to do some manual fixing to make it 'right'. I am using bioperl-live, since there have been several fixes to the script since bioperl 1.5.2 was released, including the most recent fixes from me today (when I started working on this); I would suggest you use bioperl-live as well. I ran the script on chrY. Most (perhaps all) of the errors fit into a few categories: - CDS doesn't have a phase, where the GFF3 spec requires CDSes to have a phase. Since it can be a little bit of a hassle to calculate, I understand why it was left out, but I'll submit a bug report to have those calculated. If you are planning on loading the GFF file into Chado, you can use the --noCDS option to get exons instead of CDSes, which makes the problem go away (the validator has a bug here though--it reports the polypeptide derives_from mRNA as invalid, but it is correct; I'm reporting that directly to the author). Here's the bioperl bug report: http://bugzilla.open-bio.org/show_bug.cgi?id=2322 - "invalid type pair" is caused by the genbank file using feature types in a way that conflicts with the Sequence Ontology. For example, it has STS features that are part_of a gene, pseudogenic_region as part_of pseudogene. I don't know if there would be an easy way to catch this in the conversion script. You may need to fix these by hand. If the problems occur for features that you don't care about, you can use the --filter option to leave them out of the resulting GFF file (for example, adding '--filter STS' would leave all STS features out of the file). Also, if you don't plan on loading these into Chado (which does require SO-compliance) but instead plan on using a Bio::DB::SeqFeature database, these errors may not be a problem. 
- "invalid type" is caused by feature types that are not in SOFA (Sequence Ontology for Feature Annotation), though the terms probably are in SO. I thought at one point we discussed allowing any SO type to appear in the GFF3 type column, but that is not what the spec says now. I don't see this type of error as causing a problem for either Bio::DB::SeqFeature or Chado. Chado allows features to be typed with anything that is in SO and does not restrict to SOFA. Scott On Tue, 2007-06-19 at 16:56 +0200, Alessandra Bilardi wrote: > Hi all, > > I used bp_genbank2gff3.pl with CVS bioperl and it created gff3 about > human genbank file. I used validate_gff3 on line with human.gff and > it has id non-unique so the database gbrowse inserting has errors. > > I attach the error file about hs_ref_chrY.gbk and hs_ref_chr1.gbk that > I download at at ftp://ftp.ncbi.nih.gov/genomes/H_sapiens > Elements having id non-unique are: > - CDS or pseudo*exon without mRNA and parent > - STS with egual start and end > - tRNA with egual name > > If this is a bp_genbank2gff3.pl bug, can you rectify bp_genbank2gff3.pl? > If I'm mistaken, can you help me? > > Thanks very much for the help in advance, > > Alessandra. > > _______________________________________________ Gmod-gbrowse mailing list Gmod-gbrowse at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/gmod-gbrowse -- ------------------------------------------------------------------------ Scott Cain, Ph. D. cain at cshl.edu GMOD Coordinator (http://www.gmod.org/) 216-392-3087 Cold Spring Harbor Laboratory
From sac at bioperl.org Tue Jun 19 14:54:39 2007 From: sac at bioperl.org (Steve Chervitz) Date: Tue, 19 Jun 2007 11:54:39 -0700 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <467788C5.6070406@sendu.me.uk> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> Message-ID: <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> Valid points, Sendu. I wonder if there might be a best-of-both-worlds approach here. I would not be advocating for a major slice and dice, but just identifying a few large, reasonably well established and encapsulated blocks of functionality that could be managed more independently and segregating them away from the rest. For example: DB, Graphics, Search+SearchIO, Tools. Once per year, we could have a "whole caboodle" release where the core and all sub parts are tested and released as a group, as we currently do. Then, updates to the sub parts can occur as-needed but without necessarily involving updates to other sub parts or the core. The onus would be on the pumpkin for the sub part release to make sure it continues to work with the last whole caboodle release. This would minimize the number of release clashes, since sub part updates would only be sanctioned relative to the last caboodle release, and it would ensure that the whole set continues to interoperate. Perhaps it would be worth experimenting with such an approach so we can judge it based on actual experience. We could identify one functional sub part and segregate it out, do a release cycle or two, along with a sub part release, and decide if this makes things easier or harder, for devs as well as users. We could always bring it back into the fold if it doesn't work out.
My fear is that as bioperl continues to grow, the monolithic approach will become increasingly onerous for a single release pumpkin to manage, and harder to find someone who feels up to the task. It could also discourage new developers from diving into the codebase if it looks too deep. And they are our lifeblood. A more functionally segregated bioperl codebase could lower the activation energy needed to recruit release pumpkins and new devs, leading to more release iterations, fewer bugs, more features, and more sustainable growth. When I first discovered Bioperl in 1996, it had three modules. At ~900, I probably wouldn't have joined ranks as a developer (well, I probably would, but it would have taken a while to digest it and become a contributor). Steve On 6/19/07, Sendu Bala wrote: > Steve Chervitz wrote: > > Might this been a good opportunity to investigate partitioning > > bioperl-live into sub-repositories? There has been talk in the past of > > defining a set of "core" modules separate from other functionally > > related groups of modules that would be viewed as optional extensions. > > The goal being to help manage growth and simplify releases. There are > > currently 892 modules under Bio/. > > > > In addition to simplifying the migration to SVN, it would also have > > other benefits. Say some new functionality or a slew of fixes were > > added to Bio::Graphics. We could turn around a new Bio::Graphics > > release quickly without having to work on getting various other parts > > up to snuff that aren't related to graphics (Biblio, DB, PopGen, > > Search etc.). Maintenance and releases of the various extensions would > > be more parallelizable, orchestrated by separate ring leaders. > > > > Over time, as a set of functionality matures, it would see fewer > > updates and there would be less of a need for users to > > download/install/test it. This could make bioperl easier to customize, > > extend, and grok in general. 
> > > > Long term, it should ease development and release cycles > > I actually take the opposite view. Breaking things up makes testing and > releases more difficult. > > If one person acts as pumpkin for all the sub-parts, his work-load > increases almost linearly with the number of sub-parts. If each sub-part > gets its own pumpkin, where do all these pumpkins come from? It seems to > me that frequently authors will write modules but inevitably their > circumstance changes and they can no longer devote the time to look > after them. Having a single pumpkin and 'forcing' him to make sure > everything works (regardless of his personal interest in the module) > seems more reliable than hoping there will be a person interested enough > in each sub-part to handle its release. > > Since all sub-parts will at the least interact with the 'true' core set > of Bioperl modules, they need to be tested and potentially re-released > every time the true core is updated. And since some sub-parts will > interact with other sub-parts, there will need to be coordinated > joint-testing and release of multiple sub-parts. > > What happens when users report problems? We ask them what version > they're running. Right now '1.5.2' means a specific thing, and its > trivial for someone to confirm the same problem by installing 1.5.2. > What happens when users have to list out all the versions of all the > sub-parts they have? Who is going to consistently recreate a users > hodge-podge of versions in order to confirm a bug? Won't the advice > instead be: "update all versions to the latest and get back to us"? > > So, as I see it, all sub-parts would best be tested and released with a > single new version number every time one sub-part is updated > (significantly). In which case, why have sub-parts at all? Keeping > things the way they are now means ease of release for the pumpkin and > ease of installation for end-users (only one install command to issue to > CPAN). 
Having 'true' sub-parts (each with its own pumpkin), in my > fatalistic view, is just going to lead to some useful sub-parts being > abandoned and never updated, even where updates may be desirable. > > Each and every Bio:: module could have been released separately by its > respective author. As I see it, one of the main values of 'Bioperl' is > that its one (reasonably) consistent collection of modules that lowers > the barrier of entry for new Bioinformaticians, giving them extremely > easy access to a whole host of functionality with a single install. > From bix at sendu.me.uk Tue Jun 19 15:13:39 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 19 Jun 2007 20:13:39 +0100 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> Message-ID: <46782AE3.2090703@sendu.me.uk> Steve Chervitz wrote: > Valid points, Sendu. I wonder if there might be a best-of-both-worlds > approach here. [snip] You haven't convinced me, but I'd go along with the majority decision if best-of-both-worlds was picked. > DB, Graphics, Search+SearchIO, Tools. I will, however, say that DB interleaves into too many core modules. It should stay in core. Tools? Its hardly touched anyway, so I don't see the value of taking it out, what with Bio::Tools::Run already being its own package. Most Bioperl users probably get Bioperl just to do something Blast related, so all Blast stuff really ought to stay in core. Graphics is an obvious choice and I agree. Updated frequently, and has its own release needs. It also has some of the trickier dependencies, so would make installing core simpler. I can imagine plucking Search+SearchIO out, and its something that needs regular updating. Another good candidate. 
> Perhaps it would be worth experimenting with such an approach so we > can judge it based on actual experience. We could identify one > functional sub part and segregate it out, do a release cycle or two, > along with a sub part release, and decide if this makes things easier > or harder, for devs as well as users. Well, we already have the run package. Its a split-off subpart that gets updated. The only 'experiment' left to do is finding it its own pumpkin. From bix at sendu.me.uk Tue Jun 19 15:48:50 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 19 Jun 2007 20:48:50 +0100 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: References: <369098.81077.qm@web56507.mail.re3.yahoo.com> <46777F31.7030402@sendu.me.uk> <5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net> <4677E756.6050200@sendu.me.uk> Message-ID: <46783322.30309@sendu.me.uk> Hilmar Lapp wrote: > Here's what I understand of your description of the problem: > > - We would like nodes returned from Bio::DB::Taxonomy to use the > database for all hierarchical queries. > > - We would like nodes used in a Bio::Tree::Tree not to use the > database for any hierarchical query. Correct. > What I understand that we have is > > - Taxon node objects that have a db_handle set will use the database > for ancestor(), unless it has been set manually (?), but not for > each_Descendent(). > > - Taxon node objects that don't have a db_handle set won't use a > database but will function normally otherwise. > > - This is needed to prevent Bio::Tree::Tree methods from pulling the > entire tree into memory. Correct. > If this is correct (I'm not sure it is), it sounds like we want to > temporarily divorce taxonomy nodes from their database capabilities > while they are being queried in a tree context? Yes. > I'm still trying to understand - if I create a Bio::Tree::Tree from a > single node, will the tree automatically contain all nodes along the > lineage of ancestors up to the root? 
So, even if extracting this > lineage involved querying a database it would be acceptable, but not > for querying descendents? Yes. Asking the database for all the ancestors up to root only pulls a couple of nodes into the tree and is exactly what the user would want to happen. But if nodes are allowed to get their descendants from the database, when we get the root node from the database, we'd get all the root's descendants, and then for each of those we'd get all /their/ descendants... that's when the whole db gets sucked in. > It sounds to me like what is needed is that nodes that get added to a > tree need to be stripped of their database capabilities. This could > be achieved by creating a wrapper class that delegates all non- > hierarchical methods to the wrapped Taxon object, and overriding all > hierarchical queries to not use a database. I'm not sure I fully > understand yet though, but the inconsistent behavior will be sure to > throw people off track. When we're making a tree from a db Taxon we need db access to find all the ancestors; we just don't want to get any descendants outside our initiating Taxon's direct lineage. 
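
A minimal sketch of that asymmetry in plain Perl may help. It assumes two toy classes, ToyDB and ToyTaxon, which are hypothetical stand-ins written for illustration and are not the Bio::DB::Taxonomy or Bio::Taxon code. The taxon asks its database for an ancestor on demand, but only ever reports descendants that were linked in memory, so building a lineage pulls in the direct line to the root and nothing else.

```perl
use strict;
use warnings;

# Toy stand-ins (hypothetical, NOT the Bio::DB::Taxonomy / Bio::Taxon code).
package ToyDB;

sub new { return bless { parent_of => {} }, shift }

# Record one root-to-tip lineage as child => parent links.
sub add_lineage {
    my ($self, @names) = @_;
    my $p = $self->{parent_of};
    $p->{ $names[0] } = undef unless exists $p->{ $names[0] };
    $p->{ $names[$_] } = $names[ $_ - 1 ] for 1 .. $#names;
    return;
}

sub get_taxon {
    my ($self, $name) = @_;
    return unless exists $self->{parent_of}{$name};
    return ToyTaxon->new(name => $name, db_handle => $self);
}

# Database-side hierarchy queries: these always consult the stored links.
sub ancestor {
    my ($self, $taxon) = @_;
    my $parent = $self->{parent_of}{ $taxon->name };
    return defined $parent ? $self->get_taxon($parent) : undef;
}

sub each_Descendent {
    my ($self, $taxon) = @_;
    my $p = $self->{parent_of};
    return map { $self->get_taxon($_) }
        sort grep { defined $p->{$_} && $p->{$_} eq $taxon->name } keys %$p;
}

package ToyTaxon;

sub new {
    my ($class, %args) = @_;
    return bless { %args, children => [] }, $class;
}

sub name      { $_[0]{name} }
sub db_handle { $_[0]{db_handle} }

sub add_Descendent {
    my ($self, $child) = @_;
    push @{ $self->{children} }, $child;
    $child->{ancestor} = $self;
    return;
}

# Lazy: use the in-memory ancestor if there is one, else ask the database.
sub ancestor {
    my ($self) = @_;
    if (!$self->{ancestor} && $self->{db_handle}) {
        my $anc = $self->{db_handle}->ancestor($self);
        $anc->add_Descendent($self) if $anc;
    }
    return $self->{ancestor};
}

# Never touches the database: only descendants already linked in memory.
sub each_Descendent { @{ $_[0]{children} } }

package main;

my $db = ToyDB->new;
$db->add_lineage('Eukaryota', 'Mammalia', 'Primates', 'Homo', 'Homo sapiens');
$db->add_lineage('Eukaryota', 'Mammalia', 'Rodentia', 'Mus', 'Mus musculus');

my $homo = $db->get_taxon('Homo');

# Walk up to the root; each step is one on-demand database lookup.
my $root = $homo;
while (my $anc = $root->ancestor) { $root = $anc }

# Collect the tree from the root using only the in-memory links.
my @nodes;
my @todo = ($root);
while (my $node = shift @todo) {
    push @nodes, $node->name;
    push @todo, $node->each_Descendent;
}

print join(' > ', @nodes), "\n";   # Eukaryota > Mammalia > Primates > Homo
print scalar($homo->each_Descendent), " in-memory descendants of Homo\n";
print join(', ', map { $_->name } $db->each_Descendent($homo)), "\n";
```

In this toy model the traversal from the root sees four nodes and never reaches Rodentia, while asking the database object directly still returns 'Homo sapiens' as a descendant of Homo. The test cases that follow state the same expectations against the real classes.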
    my @names = ('Eukaryota', 'Mammalia', 'Primates', 'Homo', 'Homo sapiens');
    my @ranks = qw(superkingdom class order genus species);
    my $db = Bio::DB::Taxonomy->new(-source => 'list',
                                    -names  => \@names,
                                    -ranks  => \@ranks);

    @names = ('Eukaryota', 'Mammalia', 'Rodentia', 'Mus', 'Mus musculus');
    $db->add_lineage(-names => \@names, -ranks => \@ranks);

    my $homo = $db->get_taxon(-name => 'Homo');
    isa_ok($homo, 'Bio::Taxon'); # PASS
    is $homo->ancestor->scientific_name, 'Primates'; # PASS
    my @descs = $homo->each_Descendent;
    is @descs, 1; # FAIL, we wanted it to contain the 'Homo sapiens' node

    my $lineage = Bio::Tree::Tree->new(-node => $homo);
    is $lineage->get_root_node->scientific_name, 'Eukaryota'; # PASS
    my @nodes = $lineage->get_nodes;
    is @nodes, 4; # PASS: we didn't pull in Rodentia which would be 8

(On that last test I can't remember if the answer might actually be 5 because our lineage does contain 'Homo sapiens'.)

If anyone can figure out how to get all those to pass, please let me know.

From cjfields at uiuc.edu Tue Jun 19 17:15:00 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 19 Jun 2007 16:15:00 -0500
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy)
In-Reply-To: <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
Message-ID: <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>

On Jun 19, 2007, at 1:54 PM, Steve Chervitz wrote:

> Valid points, Sendu. I wonder if there might be a best-of-both-worlds
> approach here. I would not be advocating for a major slice and dice,
> but just identifying a few large, reasonably well established and
> encapsulated blocks of functionality that could be managed more
> independently and segregating them away from the rest. For example:
> DB, Graphics, Search+SearchIO, Tools.
There should also be a consensus between the core devs on this; I don't see it going very far w/o Lincoln, Jason, Hilmar, etc. voicing their opinions as it will directly impact projects which rely on core functionality (GBrowse/GMOD, bioperl-db, etc). I also agree with George that this should be postponed until after svn issues are taken care of. Stating that, I think this is a good idea in general, though we'll need to be careful which ones we segregate out as non-core. I agree with your choices; I would add in Bio::Restriction, Bio::Assembly, Bio::Structure, and a few more. As long as the distribution required installation of 'core' prior to test runs it shouldn't be too much of a problem. In order for this to work we would need to delineate what defines 'core' (how broad the definition should be), then identify those modules that don't fit and decide what to do with them. Would we want to split the others into separate packages or lump together as a bioperl-auxiliary (horrid name, but you get my point)? Too many could be a logistical nightmare, as Sendu has pointed out. > Once per year, we could have a "whole caboodle" release where the core > and all sub parts are tested and released as a group, as we currently > do. Then, updates to the sub parts can occur as-needed but without > necessarily involving updates to other sub parts or the core. Sounds fine by me. Actually, my thought was we could reimplement Bundle::BioPerl on CPAN (which Module::Build effectively obsoleted) to install all the necessary subpackages in order to emulate an old- style 'core' installation, or act as an 'install everything BioPerl- related' Bundle. Regular updates of the subpackages to CPAN should just require updating the Bundle (which would update only the relevant parts, at least I believe it would). > The onus would be on the pumpkin for the sub part release to make sure > it continues to work with the last whole caboodle release. 
This would > minimize the number of release clashes, since sub part updates would > only be sanctioned relative to the last caboodle release, and it would > ensure that the whole set continues to interoperate. > > Perhaps it would be worth experimenting with such an approach so we > can judge it based on actual experience. We could identify one > functional sub part and segregate it out, do a release cycle or two, > along with a sub part release, and decide if this makes things easier > or harder, for devs as well as users. We could always bring it back > into the fold if it doesn't work out. > > My fear is that as bioperl continues to grow, the monolithic approach > will become increasingly onerous for a single release pumpkin to > manage, and harder to find someone who feels up to the task. It could > also discourage new developers from diving into the codebase if it > looks too deep. And they are our lifeblood. Agreed! > A more functionally segregated bioperl codebase could lower the > activation energy needed to recruit release pumpkins and new devs, > leading to more release iterations, fewer bugs, more features, and > more sustainable growth. 'Activation energy.' Hmm. Spoken like a true biologist. > When I first discovered Bioperl in 1996, it had three modules. At > ~900, I probably wouldn't have joined ranks as a developer (well, I > probably would, but it would have taken a while to digest it and > become a contributor). > > Steve I pretty much agree, though this will require quite a bit more discussion. 
chris From hlapp at gmx.net Tue Jun 19 17:57:54 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 19 Jun 2007 17:57:54 -0400 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> Message-ID: <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> On Jun 19, 2007, at 5:15 PM, Chris Fields wrote: > There should also be a consensus between the core devs on this; I > don't see it going very far w/o Lincoln, Jason, Hilmar, etc. voicing > their opinions The problem I have increasingly had with BioPerl (aside from the fact that it's written in Perl ;) is the plethora of dependencies I need to install, not the number of modules. But every time I've been told that that's what Perl is all about, and I should shut up and install the bundle. Idiosyncratically I don't like bundles that clutter up my hard disk with stuff I'll never use, and in this sense if BioPerl is divided into 10 packages I will have to think about each one whether I need it, and do a separate CVS checkout - and regular update - of each one (though granted, I believe there are ways the multiple checkout and update thing can be taken care of). In reality, this may be a rapidly disappearing trait though of those who have grown up in a time when they proudly spent all their savings to buy that new computer because it had a 20MB hard disk, compared to the two 360k floppy drives the previous one had. So don't ask me, just don't make it too hard for the dinosaurs. > as it will directly impact projects which rely on core > functionality (GBrowse/GMOD, bioperl-db, etc). Well, I hope there are ways to limit that? > I also agree with George that this should be postponed until after > svn issues are taken care of. I agree entirely. 
Please don't throw this in the same bin or tie one to the other. The migration is neither easier nor faster nor better testable with a partitioned BioPerl. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Tue Jun 19 21:48:20 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 19 Jun 2007 20:48:20 -0500 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> Message-ID: On Jun 19, 2007, at 4:57 PM, Hilmar Lapp wrote: > On Jun 19, 2007, at 5:15 PM, Chris Fields wrote: > >> There should also be a consensus between the core devs on this; I >> don't see it going very far w/o Lincoln, Jason, Hilmar, etc. voicing >> their opinions > > The problem I have increasingly had with BioPerl (aside from the fact > that it's written in Perl ;) is the plethora of dependencies I need > to install, not the number of modules. > > But every time I've been told that that's what Perl is all about, and > I should shut up and install the bundle. Idiosyncratically I don't > like bundles that clutter up my hard disk with stuff I'll never use, > and in this sense if BioPerl is divided into 10 packages I will have > to think about each one whether I need it, and do a separate CVS > checkout - and regular update - of each one (though granted, I > believe there are ways the multiple checkout and update thing can be > taken care of). I agree; the fewer dependencies the better. 
We could divide it up into a small, focused core package with only a few dependencies, and 1-3 more containing the focused bits which require the most maintenance (Graphics, SearchIO/Tools, etc). I worry about having too many more. > In reality, this may be a rapidly disappearing trait though of those > who have grown up in a time when they proudly spent all their savings > to buy that new computer because it had a 20MB hard disk, compared to > the two 360k floppy drives the previous one had. > > So don't ask me, just don't make it too hard for the dinosaurs. There would need to be some way of getting an old-style full-blown core installation regardless of how many subdistros we would divy core up into. My thought for CPAN was having Bundle::BioPerl take over this but I'm not sure if it's still being used. Maybe there are other ways for svn/cvs. >> as it will directly impact projects which rely on core >> functionality (GBrowse/GMOD, bioperl-db, etc). > > Well, I hope there are ways to limit that? I believe so, yes, particularly for bioperl-db. I would think splitting off Bio::Graphics or Bio::DB* will have some effect on GBrowse/GFF. >> I also agree with George that this should be postponed until after >> svn issues are taken care of. > > I agree entirely. Please don't throw this in the same bin or tie one > to the other. The migration is neither easier nor faster nor better > testable with a partitioned BioPerl. > > -hilmar We def. have to complete transition to subversion first, then think about this some more. chris From n.haigh at sheffield.ac.uk Wed Jun 20 02:31:24 2007 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Wed, 20 Jun 2007 07:31:24 +0100 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> Message-ID: <4678C9BC.10206@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Chris Fields wrote: > On Jun 19, 2007, at 4:57 PM, Hilmar Lapp wrote: > >> On Jun 19, 2007, at 5:15 PM, Chris Fields wrote: >> >>> There should also be a consensus between the core devs on this; I >>> don't see it going very far w/o Lincoln, Jason, Hilmar, etc. voicing >>> their opinions >> The problem I have increasingly had with BioPerl (aside from the fact >> that it's written in Perl ;) is the plethora of dependencies I need >> to install, not the number of modules. >> >> But every time I've been told that that's what Perl is all about, and >> I should shut up and install the bundle. Idiosyncratically I don't >> like bundles that clutter up my hard disk with stuff I'll never use, >> and in this sense if BioPerl is divided into 10 packages I will have >> to think about each one whether I need it, and do a separate CVS >> checkout - and regular update - of each one (though granted, I >> believe there are ways the multiple checkout and update thing can be >> taken care of). > > I agree; the fewer dependencies the better. We could divide it up > into a small, focused core package with only a few dependencies, and > 1-3 more containing the focused bits which require the most > maintenance (Graphics, SearchIO/Tools, etc). I worry about having > too many more. 
> >> In reality, this may be a rapidly disappearing trait though of those >> who have grown up in a time when they proudly spent all their savings >> to buy that new computer because it had a 20MB hard disk, compared to >> the two 360k floppy drives the previous one had. >> >> So don't ask me, just don't make it too hard for the dinosaurs. > > There would need to be some way of getting an old-style full-blown > core installation regardless of how many subdistros we would divy > core up into. My thought for CPAN was having Bundle::BioPerl take > over this but I'm not sure if it's still being used. Maybe there are > other ways for svn/cvs. Personally, I think this use of Bundle::Bioperl is more in line with what CPAN Bundles were meant to do - "a bundle is a collection of modules that comprise a cohesive unit". Under that definition you could probably put the whole of Bioperl but I won't go there! When a package is updated and a new release is made, this should be installable/updatable via cpan as well as updating the bundle with the correct version. This was you can get all of Bioperl via the bundle, or just install the sub-packages on their own. If the switch over to svn takes place, will all the Bioperl-* projects move over at the same time? If so, will they go into their own svn repository or into the same one? Since with svn you can checkout any subtree of the repository I'm not clear on the pro's and cons of either of these options. Am I right in thinking that there is a way for cvs to define a "project" such that when you checkout that "project" it actually checks out multiple projects behind the scene? I'm sure I've seen this somewhere, possibly when the project is dependent on some 3rd party code that is also in cvs. If this is possible, I'm sure it will also be possible with svn. This could then allow something like the following to happen after the split up of Bioperl. The following projects could be defined: bioperl-core, bioperl-graphics etc. 
Issuing a checkout of a "project" called "bioperl" would actually checkout the real projects called bioperl-core, bioperl-graphics etc. I may just be dreaming, but it seems that this ought to be possible, doesn't it? > >>> as it will directly impact projects which rely on core >>> functionality (GBrowse/GMOD, bioperl-db, etc). >> Well, I hope there are ways to limit that? > > I believe so, yes, particularly for bioperl-db. I would think > splitting off Bio::Graphics or Bio::DB* will have some effect on > GBrowse/GFF. > >>> I also agree with George that this should be postponed until after >>> svn issues are taken care of. >> I agree entirely. Please don't throw this in the same bin or tie one >> to the other. The migration is neither easier nor faster nor better >> testable with a partitioned BioPerl. >> >> -hilmar > > We def. have to complete transition to subversion first, then think > about this some more. > > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGeMm7czuW2jkwy2gRAi+CAJ9cNZ70GojV7eviRjdWTFLk/MKYoACg2Ls4 op9sQTZyeK6G6taFhTAPMYc= =7NRw -----END PGP SIGNATURE----- From hlapp at gmx.net Wed Jun 20 07:46:16 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 20 Jun 2007 07:46:16 -0400 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <4678C9BC.10206@sheffield.ac.uk> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> <4678C9BC.10206@sheffield.ac.uk> Message-ID: <5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Jun
20, 2007, at 2:31 AM, Nathan S. Haigh wrote: > If the switch over to svn takes place, will all the Bioperl-* projects > move over at the same time? They are under the same CVSROOT right now. Locking down some sub- repositories but not others may be odd or impossible. > If so, will they go into their own svn repository or into the same > one? Good question, I'm not sure about the pros and cons one way or the other either. The fewer repositories the less sysadmin work in fine- graining permissions. -hilmar - -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (Darwin) iD8DBQFGeRONuV6N2JxL7qsRAoYTAJ9GVuC0j4szCcWTg7yWGoxN3YFucQCgogJ8 Ims4d150lsX0vXtDwGI1lKg= =K4++ -----END PGP SIGNATURE----- From n.haigh at sheffield.ac.uk Wed Jun 20 07:57:22 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Wed, 20 Jun 2007 12:57:22 +0100 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> <4678C9BC.10206@sheffield.ac.uk> <5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net> Message-ID: <46791622.6080409@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hilmar Lapp wrote: > > On Jun 20, 2007, at 2:31 AM, Nathan S. Haigh wrote: > >> If the switch over to svn takes place, will all the Bioperl-* projects >> move over at the same time? > > They are under the same CVSROOT right now. Locking down some > sub-repositories but not others may be odd or impossible. > >> If so, will they go into their own svn repository or into the same one? 
> > Good question, I'm not sure about the pros and cons one way or the other > either. The fewer repositories the less sysadmin work in fine-graining > permissions. > > -hilmar > I don't think there is any major reason why the following single repos wouldn't do the trick: /-- |-bioperl-live | |--- trunk | |--- branches | |--- tags | |-bioperl-run |--- trunk |--- branches |--- tags Any reason why this couldn't be used? I know some people don't like the idea of the revision number incrementing for the whole repository if it contains several "projects". However, revision numbers are really only a way for svn to keep track of things and a very large revision number shouldn't really "upset" anyone. Nath -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGeRYiczuW2jkwy2gRApS5AJsHl73MWZP8aMfOqlLgTYuzpMWmQgCg3VqA 1Vj8BSUnanpdjYYLE6eGanU= =bOqK -----END PGP SIGNATURE----- From hlapp at gmx.net Wed Jun 20 08:08:33 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 20 Jun 2007 08:08:33 -0400 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <46791622.6080409@sheffield.ac.uk> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> <4678C9BC.10206@sheffield.ac.uk> <5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net> <46791622.6080409@sheffield.ac.uk> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Jun 20, 2007, at 7:57 AM, Nathan S. Haigh wrote: > I don't think there is any major reason why the following single repos > wouldn't do the trick: > > /-- > |-bioperl-live > | |--- trunk > | |--- branches > | |--- tags > | > |-bioperl-run > |--- trunk > |--- branches > |--- tags > > Any reason why this couldn't be used? 
That would work fine except that there are several more sub-projects (bioperl-db, bioperl-graphics, bioperl-microarray, and a few more). That should still be fine. I think what needs to be recognized is the limitations it puts on permission granularity. If it's all the same repository (as is now) then having commit rights to one (subproject) will mean commit rights to all. From my perspective that's fine, it has worked great so far. -hilmar - -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (Darwin) iD8DBQFGeRjFuV6N2JxL7qsRAj3dAJ42r1C8By29DNTUP9Ts0Lf5dOcS9QCgjSE1 hckjT7LBtHcmwGI8B+BKQIM= =gYfA -----END PGP SIGNATURE----- From hartzell at alerce.com Tue Jun 19 15:53:39 2007 From: hartzell at alerce.com (George Hartzell) Date: Tue, 19 Jun 2007 12:53:39 -0700 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> Message-ID: <18040.13379.217277.992742@almost.alerce.com> Steve Chervitz writes: > On 6/16/07, Jason Stajich wrote: > > [...] > > Just to say I already went through all the steps of running cvs2svn > > myself and had problems gathering back out the branches and all the > > tags when I tried it. If you want to start with a smaller repository > > like bioperl-network or bioperl-db as the initial cvs2svn conversion > > script took quite a long time to run on bioperl-live. > > Might this be a good opportunity to investigate partitioning > bioperl-live into sub-repositories? [...] I'd say that the time to do this kind of rearrangement would be *after* the svn repo's set up. That way you'll be able to track stuff back through to the beginning of time. g.
From sdavis2 at mail.nih.gov Wed Jun 20 08:44:08 2007 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Wed, 20 Jun 2007 08:44:08 -0400 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> <4678C9BC.10206@sheffield.ac.uk> <5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net> <46791622.6080409@sheffield.ac.uk> Message-ID: <46792118.4030205@mail.nih.gov> Hilmar Lapp wrote: > > On Jun 20, 2007, at 7:57 AM, Nathan S. Haigh wrote: > >> I don't think there is any major reason why the following single repos >> wouldn't do the trick: > >> /-- >> |-bioperl-live >> | |--- trunk >> | |--- branches >> | |--- tags >> | >> |-bioperl-run >> |--- trunk >> |--- branches >> |--- tags > >> Any reason why this couldn't be used? > > That would work fine except that there are several more sub-projects > (bioperl-db, bioperl-graphics, bioperl-microarray, and a few more). > > That should still be fine. I think what needs to be recognized is the > limitations it puts on permission granularity. If it's all the same > repository (as is now) then having commit rights to one (subproject) > will mean commit rights to all. From my perspective that's fine, it > has worked great so far. Actually, I think there are ways of creating per-directory access control. See here: http://svnbook.red-bean.com/en/1.2/svn-book.html#svn.serverconfig.svnserve.auth.general With Apache-based https access, such access control is relatively straightforward, it appears. With the standalone svn server over ssh, one needs to use "commit hook scripts" to limit access. But I think it is possible (admitting that I have not tried to do this...). 
Sean From hartzell at alerce.com Wed Jun 20 09:23:32 2007 From: hartzell at alerce.com (George Hartzell) Date: Wed, 20 Jun 2007 06:23:32 -0700 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <4678C9BC.10206@sheffield.ac.uk> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> <4678C9BC.10206@sheffield.ac.uk> Message-ID: <18041.10836.728079.835572@almost.alerce.com> Nathan S. Haigh writes: > [...] > If the switch over to svn takes place, will all the Bioperl-* projects > move over at the same time? If so, will they go into their own svn > repository or into the same one? Since with svn you can checkout any > subtree of the repository I'm not clear on the pro's and cons of either > of these options. I'm planning to drop the projects from the top of the CVSROOT into a single svn repository: bioperl-ext bioperl-pipeline biodata bioperl-gui bioperl-run bioperl-cookbook bioperl-live biosql-schema bioperl-corba-client bioperl-microarray html bioperl-corba-server bioperl-network task-manager bioperl-das-client bioperl-papers xml-html bioperl-db bioperl-pedigree although that's open to feedback from the core members. As a progress report, I've built a demo repos with -run, -ext, and -live in it and asked a couple of folks to take a peek at it. When I get a bit further along I'll figure out how to get something for the public to test. > Am I right in thinking that there is a way for cvs to define a "project" > such that when you checkout that "project" it actually checks out > multiple projects behind the scene? I'm sure I've seen this somewhere, > possibly when the project is dependent on some 3rd party code that is > also in cvs. If this is possible, I'm sure it will also be possible with > svn.
This could then allow something like the following to happen after > the split up of Bioperl. The following projects could be defined: > bioperl-core, bioperl-graphics etc. Issuing a checkout of a "project" > called "bioperl" would actually checkout the real projects call > bioperl-core, bioperl-graphics etc. I may just be dreaming, but it seems > that this ought to be possible, doesn't it? > [...] I don't think that there's any functionality like that in svn. g. From hartzell at alerce.com Wed Jun 20 09:26:04 2007 From: hartzell at alerce.com (George Hartzell) Date: Wed, 20 Jun 2007 06:26:04 -0700 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <46791622.6080409@sheffield.ac.uk> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> <4678C9BC.10206@sheffield.ac.uk> <5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net> <46791622.6080409@sheffield.ac.uk> Message-ID: <18041.10988.375946.833182@almost.alerce.com> Nathan S. Haigh writes: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hilmar Lapp wrote: > > > > On Jun 20, 2007, at 2:31 AM, Nathan S. Haigh wrote: > > > >> If the switch over to svn takes place, will all the Bioperl-* projects > >> move over at the same time? > > > > They are under the same CVSROOT right now. Locking down some > > sub-repositories but not others may be odd or impossible. > > > >> If so, will they go into their own svn repository or into the same one? > > > > Good question, I'm not sure about the pros and cons one way or the other > > either. The fewer repositories the less sysadmin work in fine-graining > > permissions. 
> > > > -hilmar > > > > > I don't think there is any major reason why the following single repos > wouldn't do the trick: > > /-- > |-bioperl-live > | |--- trunk > | |--- branches > | |--- tags > | > |-bioperl-run > |--- trunk > |--- branches > |--- tags > > Any reason why this couldn't be used? > [...] That's exactly the way that I'm setting it up. g. From n.haigh at sheffield.ac.uk Wed Jun 20 09:33:33 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Wed, 20 Jun 2007 14:33:33 +0100 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <18041.10836.728079.835572@almost.alerce.com> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> <4678C9BC.10206@sheffield.ac.uk> <18041.10836.728079.835572@almost.alerce.com> Message-ID: <46792CAD.5060700@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 George Hartzell wrote: > Nathan S. Haigh writes: > > [...] > > If the switch over to svn takes place, will all the Bioperl-* projects > > move over at the same time? If so, will they go into their own svn > > repository or into the same one? Since with svn you can checkout any > > subtree of the repository I'm not clear on the pro's and cons of either > > of these options. > > I'm planning to drop the projects from the top of the CVSROOT into a > single svn repository: > > bioperl-ext bioperl-pipeline biodata bioperl-gui > bioperl-run bioperl-cookbook bioperl-live biosql-schema > bioperl-corba-client bioperl-microarray html bioperl-corba-server > bioperl-network task-manager bioperl-das-client bioperl-papers > xml-html bioperl-db bioperl-pedigree > > although that's open to feedback from the core members. 
> > As a progress report, I've built a demo repos with -run, -ext, and > -live in it and asked a couple of folks to to take a peek at it. When > I get a bit further along I'll figure out how to get something for the > public to test. Could I take a peek?? > > > Am I right in thinking that there is a way for cvs to define a "project" > > such that when you checkout that "project" it actually checks out > > multiple projects behind the scene? I'm sure I've seen this somewhere, > > possibly when the project is dependent on some 3rd party code that is > > also in cvs. If this is possible, I'm sure it will also be possible with > > svn. This could then allow something like the following to happen after > > the split up of Bioperl. The following projects could be defined: > > bioperl-core, bioperl-graphics etc. Issuing a checkout of a "project" > > called "bioperl" would actually checkout the real projects call > > bioperl-core, bioperl-graphics etc. I may just be dreaming, but it seems > > that this ought to be possible, doesn't it? > > [...] > > I don't think that there's any functionality like that in svn. 
I did come across this which might help: http://subversion.tigris.org/servlets/ReadMsg?listName=users&msgNo=43561 Nath -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGeSytczuW2jkwy2gRAnlUAJ4pjhPlYlqOm+M882Ni116MJVzPCwCbB3Su sWDAmqFhGgtlyeawaIGSV14= =zeAY -----END PGP SIGNATURE----- From bix at sendu.me.uk Wed Jun 20 11:38:20 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 20 Jun 2007 16:38:20 +0100 Subject: [Bioperl-l] New testing base: BioperlTest.pm Message-ID: <467949EC.9040100@sendu.me.uk> In considering updating all the test scripts to take advantage of the new network option, and/or reimplementing them in Test::More, I thought now would be a good time to standardize all the test scripts and reduce the possibility of having to alter them all in the future if something changes. For example we could decide on an alternate way of choosing to run network tests, or a new way of deciding to output debug information. There are also some inconsistencies in the messages produced by tests skipping all, and even an unfortunate mistake that has been copy/pasted through a lot of test scripts. My solution is t/lib/BioperlTest.pm (documented with perldoc) We go from this: ---- use strict; our $DEBUG; BEGIN { $DEBUG = $ENV{'BIOPERLDEBUG'} || 0; eval { require Test::More; }; if( $@ ) { use lib 't/lib'; } use Test::More; # the mistake! use Module::Build; my $build = Module::Build->current(); my $do_network_tests = $build->notes('network'); eval { require IO::String; require LWP; require LWP::UserAgent; }; if ($@) { plan skip_all => 'IO::String or LWP or LWP::UserAgentnot installed. This means Bio::Tools::Run::RemoteBlast is not usable. Skipping tests'; } elsif (!$do_network_tests) { plan skip_all => 'Network tests have not been requested, skipping all'; } else { plan tests => 21; } #... } my $obj = Bio::Object->new(-verbose => $DEBUG); #... 
---- To this: ---- use strict; BEGIN { use lib 't/lib'; use BioperlTest; test_begin(-requires_modules => [qw(IO::String LWP LWP::UserAgent)], -requires_networking => 1, -tests => 21); #... } my $obj = Bio::Object->new(-verbose => test_debug()); #... ---- Can anyone identify problems with this approach? Is the interface presented by BioperlTest flexible enough that any changes would only be additions for new functionality (and therefore all test scripts wouldn't need to be altered)? Is BioperlTest missing anything you'd like? Are there any objections to me updating all tests in this manner? For an example, see t/RemoteBlast.t Cheers, Sendu. From spiros at lokku.com Wed Jun 20 11:49:48 2007 From: spiros at lokku.com (Spiros Denaxas) Date: Wed, 20 Jun 2007 16:49:48 +0100 Subject: [Bioperl-l] Network tests overhaul In-Reply-To: <4D4061A2-E937-4F97-85C3-8937B597C64F@uiuc.edu> References: <467661F0.2060703@sendu.me.uk> <4676A01F.30205@sendu.me.uk> <082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu> <4676B41E.3050706@sendu.me.uk> <4D4061A2-E937-4F97-85C3-8937B597C64F@uiuc.edu> Message-ID: Yep, they are not all done. Some still need to be ported over, doing some here and there at home. However, the recent email Sendu sent, the one about abstracting the setup of testing is actually something i was thinking myself so it might be a better way to tackle the problem. For once it would save us from duplicating the same 30 lines of code across all tests. As far as network tests are involved, ive always been an avid hater of them. I believe they only bring more troubles than what they contribute due to the diversity of setups people have. My way of tackling them was always to group all the tests that required live access into one file and then forcibly just run that - iff needed and not by default. Like i said, thats just my opinion, ive been bitten by them one time too many. 
Spiros On 6/18/07, Chris Fields wrote: > > On Jun 18, 2007, at 11:34 AM, Sendu Bala wrote: > > > Chris Fields wrote: > >> Couldn't you enable BIOPERLDEBUG, disable network access, then > >> iterate through tests checking for those which fail or skip? > > > > Yes, good idea, though my dev machine is also my email/webserver so > > I'd rather come up with an alternate solution than one involving > > 'disable network access'. > > > > Still, that's what I'll probably end up doing. Cheers! > > > > > > Oh, Chris, Spiros, how goes the Test::More conversion? I might want > > to wait for you to finish, or join in? If you're not going to have > > time to do any more in the next few weeks, can you please update > > http://www.bioperl.org/wiki/TestMoreProgress removing your name (or > > in the opposite case, add your name in)? Its not quite clear to me > > which tests are assigned to whom. Can someone clarify what the > > markings mean? > > > > Cheers, > > Sendu. > > Not sure how far along spiros is; I handed it over after I finished > up to the 'Q' tests. In general the ones marked out have been > converted over, ones with names next to them have been claimed. If > you need help I'll prob. start back up again to finish them off; we > just need to divy them up. > > chris > From hlapp at gmx.net Wed Jun 20 12:27:47 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 20 Jun 2007 12:27:47 -0400 Subject: [Bioperl-l] New testing base: BioperlTest.pm In-Reply-To: <467949EC.9040100@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> Message-ID: Very cool! Sounds like a no-brainer to me to adopt this in all the tests. -hilmar On Jun 20, 2007, at 11:38 AM, Sendu Bala wrote: > In considering updating all the test scripts to take advantage of the > new network option, and/or reimplementing them in Test::More, I > thought > now would be a good time to standardize all the test scripts and > reduce > the possibility of having to alter them all in the future if something > changes. 
> > For example we could decide on an alternate way of choosing to run > network tests, or a new way of deciding to output debug information. > There are also some inconsistencies in the messages produced by tests > skipping all, and even an unfortunate mistake that has been copy/ > pasted > through a lot of test scripts. > > My solution is t/lib/BioperlTest.pm (documented with perldoc) > > We go from this: > > ---- > use strict; > our $DEBUG; > > BEGIN { > $DEBUG = $ENV{'BIOPERLDEBUG'} || 0; > > eval { require Test::More; }; > if( $@ ) { > use lib 't/lib'; > } > use Test::More; # the mistake! > > use Module::Build; > my $build = Module::Build->current(); > my $do_network_tests = $build->notes('network'); > > eval { > require IO::String; > require LWP; > require LWP::UserAgent; > }; > if ($@) { > plan skip_all => 'IO::String or LWP or LWP::UserAgentnot > installed. > This means Bio::Tools::Run::RemoteBlast is not usable. Skipping > tests'; > } > elsif (!$do_network_tests) { > plan skip_all => 'Network tests have not been requested, skipping > all'; > } > else { > plan tests => 21; > } > > #... > } > > my $obj = Bio::Object->new(-verbose => $DEBUG); > #... > ---- > > To this: > > ---- > use strict; > > BEGIN { > use lib 't/lib'; > use BioperlTest; > > test_begin(-requires_modules => [qw(IO::String LWP > LWP::UserAgent)], > -requires_networking => 1, > -tests => 21); > > #... > } > > my $obj = Bio::Object->new(-verbose => test_debug()); > #... > ---- > > > Can anyone identify problems with this approach? Is the interface > presented by BioperlTest flexible enough that any changes would > only be > additions for new functionality (and therefore all test scripts > wouldn't > need to be altered)? Is BioperlTest missing anything you'd like? > > Are there any objections to me updating all tests in this manner? > For an > example, see t/RemoteBlast.t > > > Cheers, > Sendu. 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Wed Jun 20 12:44:01 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 20 Jun 2007 11:44:01 -0500 Subject: [Bioperl-l] New testing base: BioperlTest.pm In-Reply-To: References: <467949EC.9040100@sendu.me.uk> Message-ID: Agreed! You've already created an example case so there's something to go off of. I plan on changing some EUtilities tests soon so I'll try implementing this, basing off your RemoteBlast.t implementation. Seems clear enough on the surface; if I run into problems I'll post. chris On Jun 20, 2007, at 11:27 AM, Hilmar Lapp wrote: > Very cool! Sounds like a no-brainer to me to adopt this in all the > tests. -hilmar > > On Jun 20, 2007, at 11:38 AM, Sendu Bala wrote: > >> In considering updating all the test scripts to take advantage of the >> new network option, and/or reimplementing them in Test::More, I >> thought >> now would be a good time to standardize all the test scripts and >> reduce >> the possibility of having to alter them all in the future if >> something >> changes. >> >> For example we could decide on an alternate way of choosing to run >> network tests, or a new way of deciding to output debug information. >> There are also some inconsistencies in the messages produced by tests >> skipping all, and even an unfortunate mistake that has been copy/ >> pasted >> through a lot of test scripts. 
>> >> My solution is t/lib/BioperlTest.pm (documented with perldoc) >> >> We go from this: >> >> ---- >> use strict; >> our $DEBUG; >> >> BEGIN { >> $DEBUG = $ENV{'BIOPERLDEBUG'} || 0; >> >> eval { require Test::More; }; >> if( $@ ) { >> use lib 't/lib'; >> } >> use Test::More; # the mistake! >> >> use Module::Build; >> my $build = Module::Build->current(); >> my $do_network_tests = $build->notes('network'); >> >> eval { >> require IO::String; >> require LWP; >> require LWP::UserAgent; >> }; >> if ($@) { >> plan skip_all => 'IO::String or LWP or LWP::UserAgentnot >> installed. >> This means Bio::Tools::Run::RemoteBlast is not usable. Skipping >> tests'; >> } >> elsif (!$do_network_tests) { >> plan skip_all => 'Network tests have not been requested, >> skipping >> all'; >> } >> else { >> plan tests => 21; >> } >> >> #... >> } >> >> my $obj = Bio::Object->new(-verbose => $DEBUG); >> #... >> ---- >> >> To this: >> >> ---- >> use strict; >> >> BEGIN { >> use lib 't/lib'; >> use BioperlTest; >> >> test_begin(-requires_modules => [qw(IO::String LWP >> LWP::UserAgent)], >> -requires_networking => 1, >> -tests => 21); >> >> #... >> } >> >> my $obj = Bio::Object->new(-verbose => test_debug()); >> #... >> ---- >> >> >> Can anyone identify problems with this approach? Is the interface >> presented by BioperlTest flexible enough that any changes would >> only be >> additions for new functionality (and therefore all test scripts >> wouldn't >> need to be altered)? Is BioperlTest missing anything you'd like? >> >> Are there any objections to me updating all tests in this manner? >> For an >> example, see t/RemoteBlast.t >> >> >> Cheers, >> Sendu. 
>> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From wollenbergk at mail.nih.gov Wed Jun 20 14:11:04 2007 From: wollenbergk at mail.nih.gov (Wollenberg, Kurt (NIH/NIAID)) Date: Wed, 20 Jun 2007 14:11:04 -0400 Subject: [Bioperl-l] get_sequence() gets some sequences but not others Message-ID: Greetings: I am working on a script to take a list of sequence IDs, extract the sequences from GenPept, and then run a BLAST search for each of the retrieved sequences. I am having a problem with the sequence retrieval, where some sequences are found and others are not and it's not obvious to me why this is. For example, using a text file containing the two following IDs as input: SKG3_YEAST NEM1_YEAST My script while( ) { chomp; my $seqid = $_; my $seq_obj = get_sequence( 'genpept', $seqid ); } will create a sequence object for the first ID, (print "Accession of ",$seqid," is ",$seq_obj->accession, "\n"; gives me the correct accession number) but for the second I am told -------------------- WARNING --------------------- MSG: id (NEM1_YEAST) does not exist --------------------------------------------------- When I pull up these records using the Entrez cross-databse search in my web browser I find genpept records for both SKG3_YEAST and NEM1_YEAST (using these search terms). 
In both records these IDs reside in the same field ("DBSOURCE swissprot: locus") so I'm mystified why get_sequence finds one but not the other. Any advice would be greatly appreciated. Cheers, Kurt Wollenberg, Ph.D. Phylogenetics and Sequence Analysis Consultant Biocomputing Research Consulting Section Bioinformatics and Scientific IT Program (BSIP) NIH/NIAID/OTIS Contractor, Lockheed Martin http://bioinformatics.niaid.nih.gov Disclaimer: The information in this e-mail and any of its attachments is confidential and may contain sensitive information. It should not be used by anyone who is not the original intended recipient. If you have received this e-mail in error please inform the sender and delete it from your mailbox or any other storage devices. National Institute of Allergy and Infectious Diseases shall not accept liability for any statements made that are sender's own and not expressly made on behalf of the NIAID by one of its representatives. From bosborne11 at verizon.net Wed Jun 20 14:59:39 2007 From: bosborne11 at verizon.net (Brian Osborne) Date: Wed, 20 Jun 2007 14:59:39 -0400 Subject: [Bioperl-l] get_sequence() gets some sequences but not others In-Reply-To: Message-ID: Kurt, I can't answer your question but I wouldn't use Bio::Perl myself, I'd use Bio::DB::GenPept: 501 ~>perl -e 'use Bio::DB::GenPept; $db = Bio::DB::GenPept->new; $seq = $db->get_Seq_by_acc('NEM1_YEAST'); print $seq->seq;' MNALKYFSNHLITTKKQKKINVEVTKNQDLLGPSKEVSNKYTSHSENDCVSEVDQQYDHSSSHLKESDQNQERKNS VPKKPKALRSILIEKIASILWALLLFLPYYLIIKPLMSLWFVFTFPLSVIERRVKHTDKRNRGSNASENELPVSSS NINDSSEKTNPKNCNLNTIPEAVEDDLNASDEIILQRDNVKGSLLRAQSVKSRPRSYSKSELSLSNHSSSNTVFGT KRMGRFLFPKKLIPKSVLNTQKKKKLVIDLDETLIHSASRSTTHSNSSQGHLVEVKFGLSGIRTLYFIHKRPYCDL FLTKVSKWYDLIIFTASMKEYADPVIDWLESSFPSSFSKRYYRSDCVLRDGVGYIKDLSIVKDSEENGKGSSSSLD DVIIIDNSPVSYAMNVDNAIQVEGWISDPTDTDLLNLLPFLEAMRYSTDVRNILALKHGEKAFNIN502 ~> It's true that Bio::Perl is easy-to-use but it's also _very_ limited. Brian O. 
On 6/20/07 2:11 PM, "Wollenberg, Kurt (NIH/NIAID)" wrote: > Greetings: > > I am working on a script to take a list of sequence IDs, extract the > sequences from GenPept, and then run a BLAST search for each of the > retrieved sequences. I am having a problem with the sequence retrieval, > where some sequences are found and others are not and it's not obvious to me > why this is. > > For example, using a text file containing the two following IDs as input: > SKG3_YEAST > NEM1_YEAST > > My script > > while( ) { > chomp; > my $seqid = $_; > my $seq_obj = get_sequence( 'genpept', $seqid ); > } > > will create a sequence object for the first ID, (print "Accession of > ",$seqid," is ",$seq_obj->accession, "\n"; gives me the correct accession > number) but for the second I am told > > -------------------- WARNING --------------------- > MSG: id (NEM1_YEAST) does not exist > --------------------------------------------------- > > When I pull up these records using the Entrez cross-databse search in my web > browser I find genpept records for both SKG3_YEAST and NEM1_YEAST (using > these search terms). In both records these IDs reside in the same field > ("DBSOURCE swissprot: locus") so I'm mystified why get_sequence finds one > but not the other. Any advice would be greatly appreciated. > > Cheers, > Kurt Wollenberg, Ph.D. > Phylogenetics and Sequence Analysis Consultant > Biocomputing Research Consulting Section > Bioinformatics and Scientific IT Program (BSIP) > NIH/NIAID/OTIS > Contractor, Lockheed Martin > http://bioinformatics.niaid.nih.gov > > Disclaimer: > The information in this e-mail and any of its attachments is confidential > and may contain sensitive information. It should not be used by anyone who > is not the original intended recipient. If you have received this e-mail in > error please inform the sender and delete it from your mailbox or any other > storage devices. 
National Institute of Allergy and Infectious Diseases shall > not accept liability for any statements made that are sender's own and not > expressly made on behalf of the NIAID by one of its representatives. > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at uiuc.edu Wed Jun 20 16:11:34 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 20 Jun 2007 15:11:34 -0500 Subject: [Bioperl-l] get_sequence() gets some sequences but not others In-Reply-To: References: Message-ID: I'm assuming you are using the Bio::Perl exported sub get_sequence (). I am able to reproduce the issue using bioperl-live; it's an odd issue as direct use of Bio::DB::GenPept works fine: use Bio::DB::GenPept; my $factory = Bio::DB::GenPept->new(); my @accs = qw(SKG3_YEAST NEM1_YEAST); my $io = $factory->get_Stream_by_acc(\@accs); while (my $seq = $io->next_seq) { print "Accession:",$seq->accession,"\n"; } chris On Jun 20, 2007, at 1:11 PM, Wollenberg, Kurt (NIH/NIAID) wrote: > Greetings: > > I am working on a script to take a list of sequence IDs, extract the > sequences from GenPept, and then run a BLAST search for each of the > retrieved sequences. I am having a problem with the sequence > retrieval, > where some sequences are found and others are not and it's not > obvious to me > why this is. 
> > For example, using a text file containing the two following IDs as > input: > SKG3_YEAST > NEM1_YEAST > > My script > > while( ) { > chomp; > my $seqid = $_; > my $seq_obj = get_sequence( 'genpept', $seqid ); > } > > will create a sequence object for the first ID, (print "Accession of > ",$seqid," is ",$seq_obj->accession, "\n"; gives me the correct > accession > number) but for the second I am told > > -------------------- WARNING --------------------- > MSG: id (NEM1_YEAST) does not exist > --------------------------------------------------- > > When I pull up these records using the Entrez cross-databse search > in my web > browser I find genpept records for both SKG3_YEAST and NEM1_YEAST > (using > these search terms). In both records these IDs reside in the same > field > ("DBSOURCE swissprot: locus") so I'm mystified why get_sequence > finds one > but not the other. Any advice would be greatly appreciated. > > Cheers, > Kurt Wollenberg, Ph.D. > Phylogenetics and Sequence Analysis Consultant > Biocomputing Research Consulting Section > Bioinformatics and Scientific IT Program (BSIP) > NIH/NIAID/OTIS > Contractor, Lockheed Martin > http://bioinformatics.niaid.nih.gov > > Disclaimer: > The information in this e-mail and any of its attachments is > confidential > and may contain sensitive information. It should not be used by > anyone who > is not the original intended recipient. If you have received this e- > mail in > error please inform the sender and delete it from your mailbox or > any other > storage devices. National Institute of Allergy and Infectious > Diseases shall > not accept liability for any statements made that are sender's own > and not > expressly made on behalf of the NIAID by one of its representatives. > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From sac at bioperl.org Thu Jun 21 02:32:47 2007 From: sac at bioperl.org (Steve Chervitz) Date: Wed, 20 Jun 2007 23:32:47 -0700 Subject: [Bioperl-l] New testing base: BioperlTest.pm In-Reply-To: References: <467949EC.9040100@sendu.me.uk> Message-ID: <8f200b4c0706202332w25a09547k1de20f24466877d9@mail.gmail.com> Looks like a nice refactor. After it's in place, don't forget to update the wiki: http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests Steve On 6/20/07, Chris Fields wrote: > Agreed! You've already created an example case so there's something > to go off of. > > I plan on changing some EUtilities tests soon so I'll try > implementing this, basing off your RemoteBlast.t implementation. > Seems clear enough on the surface; if I run into problems I'll post. > > chris > > On Jun 20, 2007, at 11:27 AM, Hilmar Lapp wrote: > > > Very cool! Sounds like a no-brainer to me to adopt this in all the > > tests. -hilmar > > > > On Jun 20, 2007, at 11:38 AM, Sendu Bala wrote: > > > >> In considering updating all the test scripts to take advantage of the > >> new network option, and/or reimplementing them in Test::More, I > >> thought > >> now would be a good time to standardize all the test scripts and > >> reduce > >> the possibility of having to alter them all in the future if > >> something > >> changes. > >> > >> For example we could decide on an alternate way of choosing to run > >> network tests, or a new way of deciding to output debug information. > >> There are also some inconsistencies in the messages produced by tests > >> skipping all, and even an unfortunate mistake that has been copy/ > >> pasted > >> through a lot of test scripts. 
> >> > >> My solution is t/lib/BioperlTest.pm (documented with perldoc) > >> > >> We go from this: > >> > >> ---- > >> use strict; > >> our $DEBUG; > >> > >> BEGIN { > >> $DEBUG = $ENV{'BIOPERLDEBUG'} || 0; > >> > >> eval { require Test::More; }; > >> if( $@ ) { > >> use lib 't/lib'; > >> } > >> use Test::More; # the mistake! > >> > >> use Module::Build; > >> my $build = Module::Build->current(); > >> my $do_network_tests = $build->notes('network'); > >> > >> eval { > >> require IO::String; > >> require LWP; > >> require LWP::UserAgent; > >> }; > >> if ($@) { > >> plan skip_all => 'IO::String or LWP or LWP::UserAgentnot > >> installed. > >> This means Bio::Tools::Run::RemoteBlast is not usable. Skipping > >> tests'; > >> } > >> elsif (!$do_network_tests) { > >> plan skip_all => 'Network tests have not been requested, > >> skipping > >> all'; > >> } > >> else { > >> plan tests => 21; > >> } > >> > >> #... > >> } > >> > >> my $obj = Bio::Object->new(-verbose => $DEBUG); > >> #... > >> ---- > >> > >> To this: > >> > >> ---- > >> use strict; > >> > >> BEGIN { > >> use lib 't/lib'; > >> use BioperlTest; > >> > >> test_begin(-requires_modules => [qw(IO::String LWP > >> LWP::UserAgent)], > >> -requires_networking => 1, > >> -tests => 21); > >> > >> #... > >> } > >> > >> my $obj = Bio::Object->new(-verbose => test_debug()); > >> #... > >> ---- > >> > >> > >> Can anyone identify problems with this approach? Is the interface > >> presented by BioperlTest flexible enough that any changes would > >> only be > >> additions for new functionality (and therefore all test scripts > >> wouldn't > >> need to be altered)? Is BioperlTest missing anything you'd like? > >> > >> Are there any objections to me updating all tests in this manner? > >> For an > >> example, see t/RemoteBlast.t > >> > >> > >> Cheers, > >> Sendu. 
> >> _______________________________________________ > >> Bioperl-l mailing list > >> Bioperl-l at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > -- > > =========================================================== > > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > > =========================================================== > > > > > > > > > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From staffa at niehs.nih.gov Thu Jun 21 14:36:12 2007 From: staffa at niehs.nih.gov (Staffa, Nick (NIH/NIEHS)) Date: Thu, 21 Jun 2007 14:36:12 -0400 Subject: [Bioperl-l] BIO::DB::FASTA ID Message-ID: This program below returns only 1527 IDs from a fasta file that I have constructed, which has mildred> grep -c "^>Dpse" T_orthologs_Dpse_genes.fa 1820 . It actually does not return the first 3 ids, nor the 5th, nor 7..36, 38,39,41..44...... The header lines are of variable length and the sequence lines are 80 characters except at the ends when they might be shorter. Is there some caveat that I am ignoring in my format that breaks bio::db::fasta? 
#!/usr/bin/perl # # # use strict; use Bio::DB::Fasta; use Bio::Tools::SeqWords; use Bio::Seq; use Bio::SeqIO; $|=1; # # my $Dpse_UTR_file_for_T_orthologs = "/home/staffa/clients/Kari/D_pse_genome/testit/T_orthologs_Dpse_genes.fa"; my $db = Bio::DB::Fasta->new ('/home/staffa/clients/Kari/D_pse_genome/testit/T_orthologs_Dpse_genes.fa', -reindex, -makeid => \&make_my_id); my @ids = $db->ids; my $number_in = @ids; print "number of Dpse IDs = $number_in\n"; foreach my $id (@ids){ print "$id\n"; } sub make_my_id { # parse header line: # >Dpse_GA13134 CG14636 NO UTR has 2 TATTTAT 117 145, 0 TTATTTATT my $line = shift; # print "line = $line\n"; $line =~ />(\w+) /; my $ID = $1; # print "ID = $ID\n"; return $ID; } -------------- next part -------------- A non-text attachment was scrubbed... Name: T_orthologs_Dpse_genes.fa Type: application/octet-stream Size: 5033676 bytes Desc: not available URL: From jason at bioperl.org Thu Jun 21 17:19:14 2007 From: jason at bioperl.org (Jason Stajich) Date: Thu, 21 Jun 2007 14:19:14 -0700 Subject: [Bioperl-l] BIO::DB::FASTA ID In-Reply-To: References: Message-ID: Hey Nick - I think a) your IDs are not unique b) you need to declare the function make_my_id BEFORE your call Bio::DB::Fasta->new if you want your function to be used. 
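Point (a) can also be checked from Perl, which has the advantage of showing which IDs collide. The following is only a sketch and not part of the original thread: it assumes the ID is the first whitespace-delimited word after '>', and it uses a made-up three-record sample (the Dpse_GA99999 record is invented for illustration) in place of the real file.

```perl
use strict;
use warnings;

# Count how often each FASTA ID occurs. The ID is taken to be the first
# whitespace-delimited word after '>' (an assumption that matches the
# header format shown in this thread).
sub count_fasta_ids {
    my ($fh) = @_;
    my %count;
    while ( my $line = <$fh> ) {
        next unless $line =~ /^>(\S+)/;
        $count{$1}++;
    }
    return \%count;
}

# Small in-memory sample standing in for the real file (made-up records).
my $sample = <<'FASTA';
>Dpse_GA13134 CG14636 NO UTR
ACGT
>Dpse_GA13134 CG14636 second entry with the same ID
GGCC
>Dpse_GA99999 hypothetical second gene
TTAA
FASTA

open my $fh, '<', \$sample or die "cannot open in-memory sample: $!";
my $count = count_fasta_ids($fh);
close $fh;

# Total header lines versus distinct IDs.
my $headers = 0;
$headers += $_ for values %$count;
my @dups = grep { $count->{$_} > 1 } sort keys %$count;

printf "%d header lines, %d unique IDs\n", $headers, scalar keys %$count;
print "duplicated: $_ ($count->{$_} times)\n" for @dups;
```

To run this against a real file, open the file instead of the in-memory string. Separately, it may be worth checking how the options are passed in the posted script: in `-reindex, -makeid => \&make_my_id` the `-reindex` key has no value of its own, so the remaining arguments appear to pair up differently than intended; writing `-reindex => 1` would avoid that.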
$ grep "^>" T_orthologs_Dpse_genes.fa | awk '{print $1}' | sort | uniq | wc -l 1527 -jason On Jun 21, 2007, at 11:36 AM, Staffa, Nick (NIH/NIEHS) wrote: > #!/usr/bin/perl > # > # > # > use strict; > use Bio::DB::Fasta; > use Bio::Tools::SeqWords; > use Bio::Seq; > use Bio::SeqIO; > $|=1; > # > # > my $Dpse_UTR_file_for_T_orthologs = > "/home/staffa/clients/Kari/D_pse_genome/testit/ > T_orthologs_Dpse_genes.fa"; > my $db = Bio::DB::Fasta->new > ('/home/staffa/clients/Kari/D_pse_genome/testit/ > T_orthologs_Dpse_genes.fa', > -reindex, -makeid => \&make_my_id); > my @ids = $db->ids; > my $number_in = @ids; > print "number of Dpse IDs = $number_in\n"; > foreach my $id (@ids){ > print "$id\n"; > } > sub make_my_id { > # parse header line: > # >Dpse_GA13134 CG14636 NO UTR has 2 TATTTAT 117 145, 0 > TTATTTATT > my $line = shift; > # print "line = $line\n"; > $line =~ />(\w+) /; > my $ID = $1; > # print "ID = $ID\n"; > return $ID; > } -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From mkiwala at watson.wustl.edu Thu Jun 21 17:23:46 2007 From: mkiwala at watson.wustl.edu (Michael Kiwala) Date: Thu, 21 Jun 2007 16:23:46 -0500 Subject: [Bioperl-l] BIO::DB::FASTA ID In-Reply-To: References: Message-ID: <467AEC62.2040508@watson.wustl.edu> You only have 1527 unique id's in the file. ~$ grep '^>' Desktop/T_orthologs_Dpse_genes.fa|cut -d\ -f1|sort -u|wc -l 1527 Change your make_id function to make sure the id's are unique. Staffa, Nick (NIH/NIEHS) wrote: > This program below returns only 1527 IDs from a fasta file that I have > constructed, which has > mildred> grep -c "^>Dpse" T_orthologs_Dpse_genes.fa > 1820 > . > It actually does not return the first 3 ids, > nor the 5th, nor 7..36, 38,39,41..44...... > The header lines are of variable length and the sequence lines are 80 > characters except at the ends when they might be shorter. > Is there some caveat that I am ignoring in my format that breaks > bio::db::fasta? 
> > > #!/usr/bin/perl > # > # > # > use strict; > use Bio::DB::Fasta; > use Bio::Tools::SeqWords; > use Bio::Seq; > use Bio::SeqIO; > $|=1; > # > # > my $Dpse_UTR_file_for_T_orthologs = > "/home/staffa/clients/Kari/D_pse_genome/testit/T_orthologs_Dpse_genes.fa"; > my $db = Bio::DB::Fasta->new > ('/home/staffa/clients/Kari/D_pse_genome/testit/T_orthologs_Dpse_genes.fa', > -reindex, -makeid => \&make_my_id); > my @ids = $db->ids; > my $number_in = @ids; > print "number of Dpse IDs = $number_in\n"; > foreach my $id (@ids){ > print "$id\n"; > } > sub make_my_id { > # parse header line: > # >Dpse_GA13134 CG14636 NO UTR has 2 TATTTAT 117 145, 0 TTATTTATT > my $line = shift; > # print "line = $line\n"; > $line =~ />(\w+) /; > my $ID = $1; > # print "ID = $ID\n"; > return $ID; > } > > ------------------------------------------------------------------------ > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From bix at sendu.me.uk Mon Jun 25 09:06:27 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 25 Jun 2007 14:06:27 +0100 Subject: [Bioperl-l] New testing base: BioperlTest.pm In-Reply-To: <467949EC.9040100@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> Message-ID: <467FBDD3.8050009@sendu.me.uk> Sendu Bala wrote: > In considering updating all the test scripts to [... use] t/lib/BioperlTest.pm I'm now in the process of converting all test scripts. In addition to those things mentioned previously, BioperlTest now also provides the methods test_input_file() and test_output_file(). This: ---- use Bio::Root::IO; my $output_file = Bio::Root::IO->catfile(qw(t data temp.file)); $obj->new(-file => ">$output_file"); END { unlink($output_file); } ... $obj->new(-file => Bio::Root::IO->catfile(qw(t data input.file))); ---- Becomes this: ---- my $output_file = test_output_file(); $obj->new(-file => ">$output_file"); ... 
$obj->new(-file => test_input_file('input.file')); ---- I should think the benefits are obvious, especially for the output files, which thanks to inconsistency of using END blocks correctly or at all, leaves some output data behind on occasion. test_input_file() is helpful for the shorthand, but also gets rid of many tests' usage of Bio::Root::IO (relying on something you're installing and testing in another test script to work in the current test script, without testing it in your own test script seems like a no-no to me). From cjfields at uiuc.edu Mon Jun 25 09:39:21 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 25 Jun 2007 08:39:21 -0500 Subject: [Bioperl-l] New testing base: BioperlTest.pm In-Reply-To: <467FBDD3.8050009@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> Message-ID: <53825978-65F8-49F4-8836-2F6A26A5CAC4@uiuc.edu> On Jun 25, 2007, at 8:06 AM, Sendu Bala wrote: > Sendu Bala wrote: >> In considering updating all the test scripts to [... use] t/lib/ >> BioperlTest.pm > > I'm now in the process of converting all test scripts. In addition to > those things mentioned previously, BioperlTest now also provides the > methods test_input_file() and test_output_file(). > > > This: > ---- > use Bio::Root::IO; > my $output_file = Bio::Root::IO->catfile(qw(t data temp.file)); > $obj->new(-file => ">$output_file"); > > END { > unlink($output_file); > } > > ... > > $obj->new(-file => Bio::Root::IO->catfile(qw(t data input.file))); > ---- > > > Becomes this: > ---- > my $output_file = test_output_file(); > $obj->new(-file => ">$output_file"); > > ... > > $obj->new(-file => test_input_file('input.file')); > ---- > > > I should think the benefits are obvious, especially for the output > files, which thanks to inconsistency of using END blocks correctly > or at > all, leaves some output data behind on occasion. Sounds fine by me, though it's a lot of work. 
BTW, did we ever decide whether to finish up with Test::More conversion? I haven't heard back yet; let me know what you want to do. > test_input_file() is helpful for the shorthand, but also gets rid of > many tests' usage of Bio::Root::IO (relying on something you're > installing and testing in another test script to work in the current > test script, without testing it in your own test script seems like a > no-no to me). Well, in a way isn't that itself a test of the class (whether it breaks or not)? ; > Do test_input_file() and test_input_file() handle directory structures in an OS-safe way like catfile()? For instance, I plan on adding test data to a new directory similar to Bio::Graphics (t/data/ eutil) to prevent cluttering of the t/data directory. I could use '$obj->new(-file => test_input_file('/eutil/input.xml'))' if the base directory is 't/data' but that may not be cross-platform compatible with win32 file systems, which may still expect something like 't\data \eutil\input.xml'. chris From bix at sendu.me.uk Mon Jun 25 09:45:23 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 25 Jun 2007 14:45:23 +0100 Subject: [Bioperl-l] New testing base: BioperlTest.pm In-Reply-To: <53825978-65F8-49F4-8836-2F6A26A5CAC4@uiuc.edu> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <53825978-65F8-49F4-8836-2F6A26A5CAC4@uiuc.edu> Message-ID: <467FC6F3.6080705@sendu.me.uk> Chris Fields wrote: > On Jun 25, 2007, at 8:06 AM, Sendu Bala wrote: >> I should think the benefits are obvious, especially for the output >> files, which thanks to inconsistency of using END blocks correctly or at >> all, leaves some output data behind on occasion. > > Sounds fine by me, though it's a lot of work. BTW, did we ever decide > whether to finish up with Test::More conversion? I haven't heard back > yet; let me know what you want to do. I'm doing the remaining Test::More conversions at the same time. 
> Do test_input_file() and test_input_file() handle directory structures > in an OS-safe way like catfile()? For instance, I plan on adding test > data to a new directory similar to Bio::Graphics (t/data/eutil) to > prevent cluttering of the t/data directory. I could use > '$obj->new(-file => test_input_file('/eutil/input.xml'))' if the base > directory is 't/data' but that may not be cross-platform compatible with > win32 file systems, which may still expect something like > 't\data\eutil\input.xml'. Its platform-independent, currently implemented using File::Spec. So you'll say: $obj->new(-file => test_input_file('eutil', 'input.xml')); Its all documented in the POD of BioperlTest. From cjfields at uiuc.edu Mon Jun 25 09:49:51 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 25 Jun 2007 08:49:51 -0500 Subject: [Bioperl-l] New testing base: BioperlTest.pm In-Reply-To: <467FC6F3.6080705@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <53825978-65F8-49F4-8836-2F6A26A5CAC4@uiuc.edu> <467FC6F3.6080705@sendu.me.uk> Message-ID: <679B8E76-C090-4A29-B843-99B5853FE2FB@uiuc.edu> On Jun 25, 2007, at 8:45 AM, Sendu Bala wrote: > Chris Fields wrote: >> On Jun 25, 2007, at 8:06 AM, Sendu Bala wrote: >>> I should think the benefits are obvious, especially for the output >>> files, which thanks to inconsistency of using END blocks >>> correctly or at >>> all, leaves some output data behind on occasion. >> Sounds fine by me, though it's a lot of work. BTW, did we ever >> decide whether to finish up with Test::More conversion? I haven't >> heard back yet; let me know what you want to do. > > I'm doing the remaining Test::More conversions at the same time. Okay. Just didn't want to do any redundant work if it's already being/been done. >> Do test_input_file() and test_input_file() handle directory >> structures in an OS-safe way like catfile()? 
For instance, I plan >> on adding test data to a new directory similar to Bio::Graphics (t/ >> data/eutil) to prevent cluttering of the t/data directory. I >> could use '$obj->new(-file => test_input_file('/eutil/ >> input.xml'))' if the base directory is 't/data' but that may not >> be cross-platform compatible with win32 file systems, which may >> still expect something like 't\data\eutil\input.xml'. > > Its platform-independent, currently implemented using File::Spec. > So you'll say: > > $obj->new(-file => test_input_file('eutil', 'input.xml')); > > Its all documented in the POD of BioperlTest. yay! chris From mmokrejs at ribosome.natur.cuni.cz Mon Jun 25 12:06:24 2007 From: mmokrejs at ribosome.natur.cuni.cz (Martin MOKREJŠ) Date: Mon, 25 Jun 2007 18:06:24 +0200 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? In-Reply-To: <467254DD.3010505@mrc-lmb.cam.ac.uk> References: <466938F6.7050903@ribosome.natur.cuni.cz> <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> <467178AE.5040905@ribosome.natur.cuni.cz> <46717990.6040509@ribosome.natur.cuni.cz> <467254DD.3010505@mrc-lmb.cam.ac.uk> Message-ID: <467FE800.4010300@ribosome.natur.cuni.cz> Dave Howorth wrote: > Martin MOKREJŠ wrote: >>>> Also, there is a *huge* amount of documentation and examples on >>>> the BioPerl website. >>>> >>>> http://www.bioperl.org/wiki/HOWTOs >>> You mean >>> http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File >>> ? ;-) >> $ perl embl2picture.pl ~/99.gb | display - Error returned while >> evaluating value of 'description' option for glyph >> Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature >> Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method >> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl >> line 141, line 125. > > Hmm an error at line 141 of a 69 line script? Methinks you're not > actually running the script that's presented on the wiki page you > quoted.
I cut-and-pasted the script and your file and it worked for me > (at least, it produced an image, along with a bunch of OOPS lines) Maybe you used the first version of the script? There are two or more scripts, I used the very last one. M. From cjfields at uiuc.edu Mon Jun 25 12:48:30 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 25 Jun 2007 11:48:30 -0500 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? In-Reply-To: <467FE7B0.3010904@ribosome.natur.cuni.cz> References: <466938F6.7050903@ribosome.natur.cuni.cz> <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> <467178AE.5040905@ribosome.natur.cuni.cz> <46717990.6040509@ribosome.natur.cuni.cz> <46723F91.60501@ribosome.natur.cuni.cz> <467FE7B0.3010904@ribosome.natur.cuni.cz> Message-ID: Martin, Keep bioperl-related discussion on the bioperl mail list. The large majority of this isn't biopython-related, but maybe some devs there can add to this? On Jun 25, 2007, at 11:05 AM, Martin MOKREJŠ wrote: ... > Would you please tell me exactly what is wrong with the spacing? Here's a section of the seq record attached to your previous email: DEFINITION . ACCESSION . VERSION . SOURCE . ORGANISM . Normally there is a fixed column width for any data present in a field, so it would look more like this: DEFINITION PYR4 (DIHYDROOROTASE, PYRIMIDIN 4, dihydroorotase); dihydroorotase [Arabidopsis thaliana]. ACCESSION NP_194024 VERSION NP_194024.1 GI:15235865 DBSOURCE REFSEQ: accession NM_118422.3 KEYWORDS . SOURCE Arabidopsis thaliana (thale cress) ORGANISM Arabidopsis thaliana Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta; Spermatophyta; Magnoliophyta; eudicotyledons; core eudicotyledons; rosids; eurosids II; Brassicales; Brassicaceae; Arabidopsis. Here's the relevant bit in the latest release notes: "The second part of each sequence entry record contains the information appropriate to its keyword, in positions 13 to 80 for keywords and positions 11 to 80 for the sequence."
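The column rule in that quote can be checked mechanically. What follows is a minimal sketch, not BioPerl's parser and not anything from the thread: the sub name split_keyword_line is hypothetical, and the "bad" line is a made-up example with single-space separation (the exact spacing of the attached record is not recoverable from the archive). It only handles header lines that carry a value, so bare keyword lines are reported as not fitting.

```perl
use strict;
use warnings;

# Split a GenBank header line into keyword and value using the fixed
# columns from the release notes: keyword field in positions 1-12, data
# in positions 13-80. Returns an empty list if the line does not fit.
sub split_keyword_line {
    my ($line) = @_;
    chomp $line;
    return if length($line) < 13 or length($line) > 80;
    my $field = substr( $line, 0, 12 );
    my $value = substr( $line, 12 );
    # Keyword field: optional indent, an upper-case keyword, space padding.
    return unless $field =~ /^ *([A-Z]+) +$/;
    return ( $1, $value );
}

my $good = 'ACCESSION   NP_194024';    # value starts at column 13
my $bad  = 'ACCESSION .';              # value starts at column 11

for my $line ( $good, $bad ) {
    my ( $keyword, $value ) = split_keyword_line($line);
    if ( defined $keyword ) {
        print "ok:  $keyword => '$value'\n";
    }
    else {
        print "bad: '$line' does not follow the column layout\n";
    }
}
```

A tolerant parser would instead split on whitespace, which is why some tools accept such records while stricter ones reject them.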
The bioperl devs try to make our parsers as flexible as possible but others may not, so it's something in ApE that should probably be fixed. And as mentioned to you several times in the past on the mail list and on bugzilla, don't expect sequence records which sway from the standard (in this case, the release notes) to parse correctly in all cases. We can try supporting some that sway from that standard but only up to a point. If it causes additional bugs, headaches, or degrades performance it won't be supported. > ... > Well, I just copy&pasted the script from the bioperl webpages, I think > from a tutorial or FAQ, don't remember anymore. Well, can't help you if you can't point out where the code originated from. We would like to know so it can be corrected. > ... > Well, my search for such tools available on Unix to be used in a > script, > non-interactively, completely failed. My last hope except getting > improved > ApE is to use the GenomeDiagram under biopython, but so far my .gb > files > cannot be parsed yet. :( > Martin As mentioned previously you will likely have to code for it yourself (perl or python) or help debug the relevant biopython code to get it working. We can't/won't do this for you unless/until it's something we feel warrants implementation. Judging by the bug list, we also haven't the time nor inclination to code for it. Sorry but we have other priorities besides doing your work for you. chris From jesper at krogh.cc Tue Jun 26 03:05:32 2007 From: jesper at krogh.cc (Jesper Krogh) Date: Tue, 26 Jun 2007 09:05:32 +0200 (CEST) Subject: [Bioperl-l] Possible bug in Bio::SeqIO::embl.pm Message-ID: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk> Hi List. Trying to parse the embl database, the embl-parser fails on: AB019196 http://www.ebi.ac.uk/cgi-bin/expasyfetch?AB019196 ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: AB019196 seems to have an invalid species classification. 
STACK: Error::throw STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/Root.pm:359 STACK: Bio::SeqIO::embl::_read_EMBL_Species /usr/share/perl/5.8/Bio/SeqIO/embl.pm:1091 STACK: Bio::SeqIO::embl::next_seq /usr/share/perl/5.8/Bio/SeqIO/embl.pm:322 STACK: -e:1 ----------------------------------------------------------- It seems to be dissatisfied with this: OS Acetobacter aceti OC Bacteria; Proteobacteria; Alphaproteobacteria; Rhodospirillales; OC Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter. Thanks. -- Jesper Krogh From cjfields at uiuc.edu Tue Jun 26 09:13:50 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 26 Jun 2007 08:13:50 -0500 Subject: [Bioperl-l] Possible bug in Bio::SeqIO::embl.pm In-Reply-To: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk> References: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk> Message-ID: <246C49A7-D91E-4E74-921D-4B9C174AB41F@uiuc.edu> I can verify this using bioperl-live. Can you file this as a bug? http://bugzilla.open-bio.org/ chris On Jun 26, 2007, at 2:05 AM, Jesper Krogh wrote: > Hi List. > > Trying to parse the embl database, the embl-parser fails on: AB019196 > http://www.ebi.ac.uk/cgi-bin/expasyfetch?AB019196 > > > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: AB019196 seems to have an invalid species classification. > STACK: Error::throw > STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/Root.pm:359 > STACK: Bio::SeqIO::embl::_read_EMBL_Species > /usr/share/perl/5.8/Bio/SeqIO/embl.pm:1091 > STACK: Bio::SeqIO::embl::next_seq > /usr/share/perl/5.8/Bio/SeqIO/embl.pm:322 > STACK: -e:1 > ----------------------------------------------------------- > > > It seems to be dissatisfied with this: > OS Acetobacter aceti > OC Bacteria; Proteobacteria; Alphaproteobacteria; Rhodospirillales; > OC Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter. > > Thanks. 
> -- > Jesper Krogh > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From suji_ramin at yahoo.com Tue Jun 26 00:58:36 2007 From: suji_ramin at yahoo.com (SujiBala) Date: Mon, 25 Jun 2007 21:58:36 -0700 (PDT) Subject: [Bioperl-l] Error in constructing Phylogenetic tree using BioPerl Message-ID: <571051.26423.qm@web51107.mail.re2.yahoo.com> Hi Hello This is sujatha from singapore. I am trying to construct phylo tree using DNAStatistics and Kirma method. But I am getting the following error message. It would be nice if you could help me resolve this problem asap. Error messasge Must supply a valid Bio::Align::AlignI for the _align parameter in the distance My program use Bio::AlignIO; use Bio::Align::DNAStatistics; use Bio::Tree::DistanceFactory; # for a dna alignment can also use ProteinStatistics @aln = Bio::AlignIO->new(-file => 'out4.fa', -format=>'clustalw'); $stats = Bio::Align::DNAStatistics->new; $mat = $stats->distance( -align => @aln,-method => 'Kimura'); $dfactory = Bio::Tree::DistanceFactory->new(-method => 'NJ'); $tree = $dfactory->make_tree($mat); I am using clustalw formatted fasta file with more than one sequence SujiBala --------------------------------- Luggage? GPS? Comic books? Check out fitting gifts for grads at Yahoo! Search. From bartels.stefan at mh-hannover.de Tue Jun 26 05:26:03 2007 From: bartels.stefan at mh-hannover.de (don esteban) Date: Tue, 26 Jun 2007 02:26:03 -0700 (PDT) Subject: [Bioperl-l] Example code in Bioperl Tutorial In-Reply-To: References: Message-ID: <11302459.post@talk.nabble.com> Try using the Proxyconfiguration in your script: $ENV{"HTTP_PROXY"}="http://proxy.somewhere.org:8080"; L Xu wrote: > > I do have the internet connection bu not use the proxy server. 
> I tested the network connection with ping command (below). The ncbi > website > does not response. Is there any special network setting needed for > connecting the ncbi website? > Thank you so much. > > C:\>ping www.yahoo.com > > Pinging www.yahoo-ht3.akadns.net [69.147.114.210] with 32 bytes of data: > > Reply from 69.147.114.210: bytes=32 time=363ms TTL=45 > Reply from 69.147.114.210: bytes=32 time=319ms TTL=45 > Reply from 69.147.114.210: bytes=32 time=312ms TTL=45 > Reply from 69.147.114.210: bytes=32 time=360ms TTL=45 > > Ping statistics for 69.147.114.210: > Packets: Sent = 4, Received = 4, Lost = 0 (0% loss), > Approximate round trip times in milli-seconds: > Minimum = 312ms, Maximum = 363ms, Average = 338ms > > C:\>ping www.ncbi.nlm.nih.gov > > Pinging www.ncbi.nlm.nih.gov [130.14.29.110] with 32 bytes of data: > > Request timed out. > Request timed out. > Request timed out. > Request timed out. > > Ping statistics for 130.14.29.110: > Packets: Sent = 4, Received = 0, Lost = 4 (100% loss), > > > > = = = Original message = = = > > Judging by the output it looks like you have no network access or? can't > connect to the server (what remoteblast needs).? Make sure you? don't need > proxy settings. > > To preempt the next question, no, I'm not going to explain what a? proxy > is.? The RemoteBlast docs show how to set them, and Google is a? wonderful > tool... > > chris > > On Jun 13, 2007, at 7:16 AM, L Xu wrote: > > > ... > -------------------- WARNING --------------------- > MSG: > An Error Occurred > >

> 500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
>
> ---------------------------------------------------
> ...
>
> ___________________________________________________________
> Sent by ePrompter, the premier email notification software.
> Free download at http://www.ePrompter.com.
>
> _________________________________________________________________
> Get a preview of Live Earth, the hottest event this summer - only on MSN
> http://liveearth.msn.com?source=msntaglineliveearthhm
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>

--
View this message in context: http://www.nabble.com/Example-code-in-Bioperl-Tutorial-tf3914295.html#a11302459
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.

From rahall2 at ualr.edu Tue Jun 26 09:51:08 2007
From: rahall2 at ualr.edu (Roger Hall)
Date: Tue, 26 Jun 2007 08:51:08 -0500
Subject: [Bioperl-l] Tuesday: ill
Message-ID: <000001c7b7f9$0d029040$4601a8c0@LIBERAL2>

Well I guess I won't be in today after all.

Michael, Stephen, and Ames: please call me from the grad office at 10 on my cell phone (744-8514).

Phil: please go ahead and meet with Tim, and let me know what questions remain afterwards.

Thanks!

Roger Hall
Technical Director
MidSouth Bioinformatics Center
University of Arkansas at Little Rock
(501) 569-8074

From cjfields at uiuc.edu Tue Jun 26 10:02:29 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 26 Jun 2007 09:02:29 -0500
Subject: [Bioperl-l] Possible bug in Bio::SeqIO::embl.pm
In-Reply-To: <4681185D.5030402@cam.ac.uk>
References: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk> <246C49A7-D91E-4E74-921D-4B9C174AB41F@uiuc.edu> <4681185D.5030402@cam.ac.uk>
Message-ID:

I'll try getting to that ASAP (as well as a few bugs).
The problem is we have to patch this in 2-3 places (SeqIO::swiss, SeqIO::embl) due to repeated code issues, something I'm trying to rectify with a new set of parsers. Just haven't had the time to work on them lately unfortunately. chris On Jun 26, 2007, at 8:45 AM, Roy Chaudhuri wrote: > Sorry, replied to this but forgot to cc the list. > > It looks like a related problem to bug 2288 that I filed about > Bio::SeqIO::swiss - the period after subgen. is what causes the > problems since it is interpreted as a seperator between nodes. I > put a patch in for Bio::SeqIO::swiss that works for me, but I guess > it might have side effects. > > Roy. > -- > Dr. Roy Chaudhuri > Department of Veterinary Medicine > University of Cambridge, U.K. > > Chris Fields wrote: >> I can verify this using bioperl-live. Can you file this as a bug? >> http://bugzilla.open-bio.org/ >> chris >> On Jun 26, 2007, at 2:05 AM, Jesper Krogh wrote: >>> Hi List. >>> >>> Trying to parse the embl database, the embl-parser fails on: >>> AB019196 >>> http://www.ebi.ac.uk/cgi-bin/expasyfetch?AB019196 >>> >>> >>> ------------- EXCEPTION: Bio::Root::Exception ------------- >>> MSG: AB019196 seems to have an invalid species classification. >>> STACK: Error::throw >>> STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/ >>> Root.pm:359 >>> STACK: Bio::SeqIO::embl::_read_EMBL_Species >>> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:1091 >>> STACK: Bio::SeqIO::embl::next_seq >>> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:322 >>> STACK: -e:1 >>> ----------------------------------------------------------- >>> >>> >>> It seems to be dissatisfied with this: >>> OS Acetobacter aceti >>> OC Bacteria; Proteobacteria; Alphaproteobacteria; >>> Rhodospirillales; >>> OC Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter. >>> >>> Thanks. 
>>> -- >>> Jesper Krogh >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> Christopher Fields >> Postdoctoral Researcher >> Lab of Dr. Robert Switzer >> Dept of Biochemistry >> University of Illinois Urbana-Champaign >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From rrc22 at cam.ac.uk Tue Jun 26 09:45:01 2007 From: rrc22 at cam.ac.uk (Roy Chaudhuri) Date: Tue, 26 Jun 2007 14:45:01 +0100 Subject: [Bioperl-l] Possible bug in Bio::SeqIO::embl.pm In-Reply-To: <246C49A7-D91E-4E74-921D-4B9C174AB41F@uiuc.edu> References: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk> <246C49A7-D91E-4E74-921D-4B9C174AB41F@uiuc.edu> Message-ID: <4681185D.5030402@cam.ac.uk> Sorry, replied to this but forgot to cc the list. It looks like a related problem to bug 2288 that I filed about Bio::SeqIO::swiss - the period after subgen. is what causes the problems since it is interpreted as a seperator between nodes. I put a patch in for Bio::SeqIO::swiss that works for me, but I guess it might have side effects. Roy. -- Dr. Roy Chaudhuri Department of Veterinary Medicine University of Cambridge, U.K. Chris Fields wrote: > I can verify this using bioperl-live. Can you file this as a bug? > > http://bugzilla.open-bio.org/ > > chris > > On Jun 26, 2007, at 2:05 AM, Jesper Krogh wrote: > >> Hi List. >> >> Trying to parse the embl database, the embl-parser fails on: AB019196 >> http://www.ebi.ac.uk/cgi-bin/expasyfetch?AB019196 >> >> >> ------------- EXCEPTION: Bio::Root::Exception ------------- >> MSG: AB019196 seems to have an invalid species classification. 
>> STACK: Error::throw >> STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/Root.pm:359 >> STACK: Bio::SeqIO::embl::_read_EMBL_Species >> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:1091 >> STACK: Bio::SeqIO::embl::next_seq >> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:322 >> STACK: -e:1 >> ----------------------------------------------------------- >> >> >> It seems to be dissatisfied with this: >> OS Acetobacter aceti >> OC Bacteria; Proteobacteria; Alphaproteobacteria; Rhodospirillales; >> OC Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter. >> >> Thanks. >> -- >> Jesper Krogh >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From bix at sendu.me.uk Tue Jun 26 10:13:48 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 26 Jun 2007 15:13:48 +0100 Subject: [Bioperl-l] Error in constructing Phylogenetic tree using BioPerl In-Reply-To: <571051.26423.qm@web51107.mail.re2.yahoo.com> References: <571051.26423.qm@web51107.mail.re2.yahoo.com> Message-ID: <46811F1C.3020307@sendu.me.uk> SujiBala wrote: > Hi Hello > This is sujatha from singapore. I am trying to construct phylo tree using DNAStatistics and Kirma method. But I am getting the following error message. It would be nice if you could help me resolve this problem asap. 
> > Error messasge > Must supply a valid Bio::Align::AlignI for the _align parameter in the distance > My program > use Bio::AlignIO; > use Bio::Align::DNAStatistics; > use Bio::Tree::DistanceFactory; > # for a dna alignment can also use ProteinStatistics > @aln = Bio::AlignIO->new(-file => 'out4.fa', -format=>'clustalw'); > $stats = Bio::Align::DNAStatistics->new; > $mat = $stats->distance( -align => @aln,-method => 'Kimura'); Without looking at the docs for these modules, it is immediately obvious that Bio::AlignIO->new() is going to return an instance of Bio::AlignIO and not an array of alignments. It is also obvious that the -align => parameter for the distance() method can't take an array of anything (but probably an array ref?). Check the documentation and make sure you know what objects you're generating and passing around. From schlesi at ebi.ac.uk Tue Jun 26 10:59:13 2007 From: schlesi at ebi.ac.uk (Felix Schlesinger) Date: Tue, 26 Jun 2007 15:59:13 +0100 Subject: [Bioperl-l] PAML parser Message-ID: <7317d50c0706260759r3bdda445lf0acf10a31ee5765@mail.gmail.com> Hello, I am trying to use the PAML result parser (BioPerl Bio::Tools::Phylo::PAML) on output files generated by PAML 3.15. However on all outputs I have tested no result object is returned (next_result is undef). This includes the HIV and Lysin datasets included with PAML. My code is: my $codemlp = Bio::Tools::Phylo::PAML->new(-file => "mlc",dir => "/."); my $result = $codemlp->next_result; foreach my $model ( $result->get_NSSite_results ) { ... and the error is: Can't call method "get_NSSite_results" on an undefined value ... I can include the mlc file is needed. Is this supposed to work? Or do I have to run paml from bioperl to parse the results? 
Thanks Felix From Xianjun.Dong at bccs.uib.no Tue Jun 26 10:35:17 2007 From: Xianjun.Dong at bccs.uib.no (Xianjun Dong) Date: Tue, 26 Jun 2007 16:35:17 +0200 Subject: [Bioperl-l] bug for PAML::Baseml Message-ID: <46812425.8000509@ii.uib.no> An HTML attachment was scrubbed... URL: From Xianjun.Dong at bccs.uib.no Tue Jun 26 11:40:47 2007 From: Xianjun.Dong at bccs.uib.no (Xianjun Dong) Date: Tue, 26 Jun 2007 17:40:47 +0200 Subject: [Bioperl-l] bug for PAML::Baseml In-Reply-To: <46812425.8000509@ii.uib.no> References: <46812425.8000509@ii.uib.no> Message-ID: <4681337F.1000902@ii.uib.no> An HTML attachment was scrubbed... URL: From hartzell at alerce.com Tue Jun 26 14:12:04 2007 From: hartzell at alerce.com (George Hartzell) Date: Tue, 26 Jun 2007 14:12:04 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> Message-ID: <18049.22260.967524.353173@almost.alerce.com> There don't seem to be any .cvsignore files in the repository, or in CVSROOT/cvsignore. Am I missing something, or don't we use them? g. From cjfields at uiuc.edu Tue Jun 26 15:54:25 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 26 Jun 2007 14:54:25 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <18049.22260.967524.353173@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.22260.967524.353173@almost.alerce.com> Message-ID: <74515C87-5553-4AF0-9B83-26F3E71E15C8@uiuc.edu> Not sure. 
You may want to email support at open-bio.org; my guess is Chris D or Jason would have an answer. chris On Jun 26, 2007, at 1:12 PM, George Hartzell wrote: > > There don't seem to be any .cvsignore files in the repository, or in > CVSROOT/cvsignore. > > Am I missing something, or don't we use them? > > g. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Tue Jun 26 15:55:21 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 26 Jun 2007 16:55:21 -0300 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <18049.22260.967524.353173@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.22260.967524.353173@almost.alerce.com> Message-ID: Maybe we've been using the default? On Jun 26, 2007, at 3:12 PM, George Hartzell wrote: > > There don't seem to be any .cvsignore files in the repository, or in > CVSROOT/cvsignore. > > Am I missing something, or don't we use them? > > g. 
> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hartzell at alerce.com Tue Jun 26 16:21:30 2007 From: hartzell at alerce.com (George Hartzell) Date: Tue, 26 Jun 2007 16:21:30 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> Message-ID: <18049.30026.61328.134490@almost.alerce.com> Chris Fields writes: > [...] > It looks like George Hartzell may be taking a crack at it, with > Rutger Vos, Nathan Haigh, and moi helping out where needed. If so we > could have something testable relatively soon. After that we'll need > to work out a few other issues, basically what's on Hilmar's list. There's a repository on file:///home/hartzell/bioperl with all of the components projects in place. If you have a dev.open-bio.org account and you're in the bioperl group, you're good to get at it via: file:///home/hartzell/bioperl or svn+ssh://dev.open-bio.org/home/hartzell/bioperl There are a couple of things to think about: - how are we going to provide access. I *think* that I heard a decision to use http:// and https://. Who gets to set that up? - what do we want to do about keywords. The cvs2svn tool guesses and automatically sets the svn:keywords property to Author Date Revision and Id on many of the files in the tree. If it looks like it got it right, we can stick with it. 
Or, we can disable that conversion and I've cribbed a little script that'll grep out files using Id and set the svn:keywords property accordingly.

- what do we want to do about svn:ignore? I haven't seen any .cvsignore files.

Beyond that, how does the repo look?

How are we going to cut over?

Are we going to try to push svn commits to the read-mostly CVS repo, or just keep it around for history's sake (I lean towards the latter).

g.

From jason at bioperl.org Tue Jun 26 19:22:20 2007
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 26 Jun 2007 20:22:20 -0300
Subject: [Bioperl-l] PAML parser
In-Reply-To: <7317d50c0706260759r3bdda445lf0acf10a31ee5765@mail.gmail.com>
References: <7317d50c0706260759r3bdda445lf0acf10a31ee5765@mail.gmail.com>
Message-ID:

Can you make sure you have the latest and greatest version of these modules from the CVS repository? We had to fix things to parse 3.15 -- I can't tell if this is the problem or something else.

You can also add -verbose => 1 when you initialize the object and it may spit out more warnings about whether it is having problems.

-jason

On Jun 26, 2007, at 11:59 AM, Felix Schlesinger wrote:

> Hello,
>
> I am trying to use the PAML result parser (BioPerl
> Bio::Tools::Phylo::PAML) on output files generated by PAML 3.15.
> However on all outputs I have tested no result object is returned
> (next_result is undef). This includes the HIV and Lysin datasets
> included with PAML.
> My code is:
>
> my $codemlp = Bio::Tools::Phylo::PAML->new(-file => "mlc",dir =>
> "/.");
> my $result = $codemlp->next_result;
> foreach my $model ( $result->get_NSSite_results ) {
> ...
>
> and the error is: Can't call method "get_NSSite_results" on an
> undefined value ...
>
> I can include the mlc file is needed. Is this supposed to work? Or do
> I have to run paml from bioperl to parse the results?
> > Thanks > Felix > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From jason at bioperl.org Tue Jun 26 19:27:05 2007 From: jason at bioperl.org (Jason Stajich) Date: Tue, 26 Jun 2007 20:27:05 -0300 Subject: [Bioperl-l] Error in constructing Phylogenetic tree using BioPerl In-Reply-To: <46811F1C.3020307@sendu.me.uk> References: <571051.26423.qm@web51107.mail.re2.yahoo.com> <46811F1C.3020307@sendu.me.uk> Message-ID: On Jun 26, 2007, at 11:13 AM, Sendu Bala wrote: > SujiBala wrote: >> Hi Hello >> This is sujatha from singapore. I am trying to construct phylo >> tree using DNAStatistics and Kirma method. But I am getting the >> following error message. It would be nice if you could help me >> resolve this problem asap. >> >> Error messasge >> Must supply a valid Bio::Align::AlignI for the _align >> parameter in the distance >> My program >> use Bio::AlignIO; >> use Bio::Align::DNAStatistics; >> use Bio::Tree::DistanceFactory; >> # for a dna alignment can also use ProteinStatistics >> @aln = Bio::AlignIO->new(-file => 'out4.fa', -format=>'clustalw'); >> $stats = Bio::Align::DNAStatistics->new; >> $mat = $stats->distance( -align => @aln,-method => 'Kimura'); > yep you want to call next_aln on the Bio::AlignIO object. I fixed the example code in the HOWTO so it should work properly now; http://bioperl.org/wiki/HOWTO:Trees#Constructing_Trees > Without looking at the docs for these modules, it is immediately > obvious > that Bio::AlignIO->new() is going to return an instance of > Bio::AlignIO > and not an array of alignments. It is also obvious that the -align => > parameter for the distance() method can't take an array of anything > (but > probably an array ref?). > > Check the documentation and make sure you know what objects you're > generating and passing around. 
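
For the archive, a minimal sketch of the corrected script from this thread, assuming BioPerl is installed and that the input file really is in ClustalW format. The calls are the ones already used in the original post plus next_aln; the build_nj_tree wrapper, the run-time require statements and the Newick output step are my own additions for illustration, not taken from the HOWTO.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build a neighbor-joining tree from the first alignment in a ClustalW file.
# build_nj_tree is a hypothetical helper name, not a BioPerl function.
sub build_nj_tree {
    my ($file) = @_;

    # Loaded at run time so the script still compiles where BioPerl is absent.
    require Bio::AlignIO;
    require Bio::Align::DNAStatistics;
    require Bio::Tree::DistanceFactory;

    # Bio::AlignIO->new returns a stream object, not an alignment.
    my $alnio = Bio::AlignIO->new(-file => $file, -format => 'clustalw');

    # next_aln pulls one alignment object out of the stream.
    my $aln = $alnio->next_aln
        or die "No alignment found in $file\n";

    # distance() wants a single alignment object for -align, not an array.
    my $stats = Bio::Align::DNAStatistics->new;
    my $mat   = $stats->distance(-align => $aln, -method => 'Kimura');

    my $dfactory = Bio::Tree::DistanceFactory->new(-method => 'NJ');
    return $dfactory->make_tree($mat);
}

# Only runs when a file name is given on the command line.
if (@ARGV) {
    require Bio::TreeIO;
    my $tree = build_nj_tree($ARGV[0]);
    # With no -file or -fh, the tree is written to STDOUT.
    Bio::TreeIO->new(-format => 'newick')->write_tree($tree);
}
```

Run it as 'perl nj_tree.pl out4.fa' (the script name is arbitrary). If the file is actually FASTA rather than ClustalW output, the -format value would need to change to match.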
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/

From jason at bioperl.org Tue Jun 26 19:29:11 2007
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 26 Jun 2007 20:29:11 -0300
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To:
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.22260.967524.353173@almost.alerce.com>
Message-ID: <5A8FD8A3-9593-4925-AA74-D4B03CDC1C34@bioperl.org>

We don't have one. I have one on my local machine that basically defines *~ and .#* so I never had a problem. Feel free to propose one if you think it is important; I never really thought it was.

On Jun 26, 2007, at 4:55 PM, Hilmar Lapp wrote:

> Maybe we've been using the default?
>
> On Jun 26, 2007, at 3:12 PM, George Hartzell wrote:
>
>>
>> There don't seem to be any .cvsignore files in the repository, or in
>> CVSROOT/cvsignore.
>>
>> Am I missing something, or don't we use them?
>>
>> g.
>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From j_martin at lbl.gov Tue Jun 26 21:01:29 2007 From: j_martin at lbl.gov (Joel Martin) Date: Tue, 26 Jun 2007 18:01:29 -0700 Subject: [Bioperl-l] Example code in Bioperl Tutorial In-Reply-To: <11302459.post@talk.nabble.com> References: <11302459.post@talk.nabble.com> Message-ID: <20070627010129.GA8628@eniac.jgi-psf.org> Hello, The tutorial code snippet is an endless loop, I think it's supposed to remove the rid. As the only print statement you added is after the endless loop, you aren't seeing anything happen. Use the code from this instead, perldoc Bio::Tools::Run::RemoteBlast The bptutorial.pl does have a note that it's not useful and to read the pod for Bio::Tools::Run::RemoteBlast, it's in the next sentences after the code snippet you used. Though, as it's a tutorial example it might be nice to remove the while loop .. or at least add the sleep(5) part. http://www.bioperl.org/wiki/Bptutorial.pl#Running_BLAST_.28using_RemoteBlast.pm.29 Aside from that, you may have network issues but www.ncbi.nlm.nih.gov doesn't respond to ping as far as I can tell. Joel On Tue, Jun 26, 2007 at 02:26:03AM -0700, don esteban wrote: > > Try using the Proxyconfiguration in your script: > > $ENV{"HTTP_PROXY"}="http://proxy.somewhere.org:8080"; > > > > > L Xu wrote: > > > > I do have the internet connection bu not use the proxy server. 
> > I tested the network connection with ping command (below). The ncbi > > website > > does not response. Is there any special network setting needed for > > connecting the ncbi website? > > Thank you so much. > > > > C:\>ping www.yahoo.com > > > > Pinging www.yahoo-ht3.akadns.net [69.147.114.210] with 32 bytes of data: > > > > Reply from 69.147.114.210: bytes=32 time=363ms TTL=45 > > Reply from 69.147.114.210: bytes=32 time=319ms TTL=45 > > Reply from 69.147.114.210: bytes=32 time=312ms TTL=45 > > Reply from 69.147.114.210: bytes=32 time=360ms TTL=45 > > > > Ping statistics for 69.147.114.210: > > Packets: Sent = 4, Received = 4, Lost = 0 (0% loss), > > Approximate round trip times in milli-seconds: > > Minimum = 312ms, Maximum = 363ms, Average = 338ms > > > > C:\>ping www.ncbi.nlm.nih.gov > > > > Pinging www.ncbi.nlm.nih.gov [130.14.29.110] with 32 bytes of data: > > > > Request timed out. > > Request timed out. > > Request timed out. > > Request timed out. > > > > Ping statistics for 130.14.29.110: > > Packets: Sent = 4, Received = 0, Lost = 4 (100% loss), > > > > > > > > = = = Original message = = = > > > > Judging by the output it looks like you have no network access or? can't > > connect to the server (what remoteblast needs).? Make sure you? don't need > > proxy settings. > > > > To preempt the next question, no, I'm not going to explain what a? proxy > > is.? The RemoteBlast docs show how to set them, and Google is a? wonderful > > tool... > > > > chris > > > > On Jun 13, 2007, at 7:16 AM, L Xu wrote: > > > > > > ... > > -------------------- WARNING --------------------- > > MSG: > > An Error Occurred > > > >

> > 500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error) > > > > > > > > --------------------------------------------------- > > ... > > > > ___________________________________________________________ > > Sent by ePrompter, the premier email notification software. > > Free download at http://www.ePrompter.com. > > > > _________________________________________________________________ > > Get a preview of Live Earth, the hottest event this summer - only on MSN > > http://liveearth.msn.com?source=msntaglineliveearthhm > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > > > -- > View this message in context: http://www.nabble.com/Example-code-in-Bioperl-Tutorial-tf3914295.html#a11302459 > Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From melvinp at pacific.net.sg Wed Jun 27 01:25:08 2007 From: melvinp at pacific.net.sg (Melvin P) Date: Wed, 27 Jun 2007 13:25:08 +0800 Subject: [Bioperl-l] finding statistics on AA Message-ID: <4681F4B4.8010609@pacific.net.sg> Hi, I am new to BioPerl. I am trying to find out if there is any class that I can use for occupancy number/occurrence counts, psuedo count, observed frequency etc given a few sequences of amino acid. For example, what is the observed frequency of residue i at position p. My objective is to analyze the information content. Thanks. 
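
This is not a pointer to an existing BioPerl class, just a plain-Perl sketch of the quantities asked about. It assumes the sequences are already aligned to equal length, that '-' marks a gap, and that the alphabet has 20 residues. The function names, the uniform pseudocount scheme and the log2(20) minus entropy definition of information content are all assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Tally residues per alignment column; gaps ('-') are not counted.
# Returns an array ref of hash refs: $counts->[$pos]{$residue} = occurrences.
sub column_counts {
    my @seqs = @_;
    my @counts;
    for my $seq (@seqs) {
        my @res = split //, uc $seq;
        for my $p (0 .. $#res) {
            next if $res[$p] eq '-';
            $counts[$p]{ $res[$p] }++;
        }
    }
    return \@counts;
}

# Observed frequency of $residue at 0-based column $pos.
# An optional pseudocount b gives (n + b) / (N + 20 * b).
sub observed_frequency {
    my ($counts, $pos, $residue, $pseudo) = @_;
    $pseudo = 0 unless defined $pseudo;
    my $col   = $counts->[$pos] || {};
    my $total = 0;
    $total += $_ for values %$col;
    my $denom = $total + 20 * $pseudo;
    return 0 unless $denom > 0;
    my $n = $col->{ uc $residue } || 0;
    return ($n + $pseudo) / $denom;
}

# Information content in bits: log2(20) minus the Shannon entropy
# of the observed (non-gap) residue frequencies in the column.
sub information_content {
    my ($counts, $pos) = @_;
    my $col   = $counts->[$pos] || {};
    my $total = 0;
    $total += $_ for values %$col;
    return 0 unless $total;
    my $entropy = 0;
    for my $n (values %$col) {
        my $f = $n / $total;
        $entropy -= $f * log($f) / log(2);
    }
    return log(20) / log(2) - $entropy;
}

# Small demonstration on a made-up alignment.
my @demo_aln    = qw(MKVLA MKILA MR-LA);
my $demo_counts = column_counts(@demo_aln);
for my $p (0 .. $#$demo_counts) {
    printf "position %d: f(K) = %.3f, information = %.3f bits\n",
        $p + 1,
        observed_frequency($demo_counts, $p, 'K'),
        information_content($demo_counts, $p);
}
```

The sequences could just as well come from an alignment read with Bio::AlignIO; the counting itself does not depend on how they were loaded.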
From bix at sendu.me.uk Wed Jun 27 06:23:58 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 27 Jun 2007 11:23:58 +0100
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <467FBDD3.8050009@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
Message-ID: <46823ABE.2080300@sendu.me.uk>

Sendu Bala wrote:
> Sendu Bala wrote:
>> In considering updating all the test scripts to [... use]
>> t/lib/BioperlTest.pm
>
> I'm now in the process of converting all test scripts.

And I've now completed that job (for bioperl-live at least), except for t/EUtilities.t since I know Chris is working on it.

In addition to converting to Test::More where necessary, I've also made all pseudo-TODO blocks real ones. Previously I had advised using SKIP blocks instead, since TODO blocks need a Test::Harness upgrade. However I think in the next release we ought to make such upgrading compulsory (which should be automatic when combined with compulsory usage of Module::Build and Test::More in turn: users simply have to update CPAN).

The conversion to BioperlTest directly led to the discovery and fixing of 6 minor bugs, so was certainly not without merit.

No user or developer needs to have BIOPERLDEBUG permanently set to true anymore. To run all tests you just have to answer yes to the BioDBGFF and networking questions of 'perl Build.PL'. With './Build test' you then get clean, easy-to-read output where it is easy to see that we currently have these issues:

t/Sopma.t and t/BioGraphics.t still have fails that I mentioned in another thread.

t/protgraph.t, t/blast_pull.t, t/SearchIO.t, t/RestrictionIO.t, t/RNA_SearchIO.t, t/PopGen.t, t/Genewise.t, t/Assembly.t and t/Annotation.t all have TODO tests. If you know about those modules, now would be a great time to implement those TODOs!

Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are deprecated' warnings.
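
For anyone who has not met a real TODO block, here is a minimal sketch using plain Test::More. The revcom function is a made-up stand-in for a module method and is not BioPerl code; the point is only that the failing test inside the TODO block is reported as an expected failure rather than breaking the run.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Test::More tests => 2;

# Hypothetical function under test: reverse-complements plain DNA only.
sub revcom {
    my ($seq) = @_;
    my $rc = reverse $seq;          # scalar context, so the string is reversed
    $rc =~ tr/ACGTacgt/TGCAtgca/;   # ambiguity codes are left untouched
    return $rc;
}

# An ordinary test that passes today.
is(revcom('AACG'), 'CGTT', 'plain DNA is reverse-complemented');

# A feature that is not implemented yet. The test runs and fails,
# but the harness treats the failure as expected because $TODO is set.
TODO: {
    local $TODO = 'IUPAC ambiguity codes not handled yet';
    is(revcom('AARG'), 'CYTT', 'ambiguity codes are complemented');
}
```

Once the feature is implemented, the harness flags the TODO test as unexpectedly passing, which is the cue to take it out of the block.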
To debug a particular test you could say: BIOPERLDEBUG=1 ./Build test --verbose --test_files t/Sopma.t I've updated the HOWTO for writing test scripts: http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests From cjfields at uiuc.edu Wed Jun 27 07:55:47 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 06:55:47 -0500 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <46823ABE.2080300@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> Message-ID: On Jun 27, 2007, at 5:23 AM, Sendu Bala wrote: > Sendu Bala wrote: >> Sendu Bala wrote: >>> In considering updating all the test scripts to [... use] >>> t/lib/BioperlTest.pm >> >> I'm now in the process of converting all test scripts. > > And I've now completed that job (for bioperl-live at least), except > for > t/EUtilities.t since I know Chris is working on it. The network tests will be much shorter; the bulk will be transferred to a new suite for the backend Bio::Tools:EUtilities parser (which will test static files in t/data/eutils, so no dynamic changes). > In addition to converting to Test::More where necessary, I've also > made > all psuedo-TODO blocks real ones. Previously I had advised to use SKIP > blocks instead since TODO blocks need a Test::Harness upgrade. > However I > think in the next release we ought to make such upgrading compulsory > (which should be automatic when combined with compulsory usage of > Module::Build and Test::More in turn: users simply have to update > CPAN). Sounds good to me, but there may be some grumblings out there. Having specific TODOs are nice b/c we can test them w/o fails. Handy. > The conversion to BioperlTest directly led to the discovery and fixing > of 6 minor bugs, so was certainly not without merit. > > > No user or developer needs to have BIOPERLDEBUG permanently set to > true > anymore. 
To run all tests you just have to answer yes to the BioDBGFF > and networking questions of 'perl Build.PL'. With './Build test' you > then get clean, easy-to-read output where it is obvious to see that we > currently have these issues: > > t/Sopma.t and t/BioGraphics.t still have fails that I mentioned in > another thread. > > t/protgraph.t, t/blast_pull.t, t/SearchIO.t, t/RestrictionIO.t, > t/RNA_SearchIO.t, t/PopGen.t, t/Genewise.t, t/Assembly.t and > t/Annotation.t all have TODO tests. If you know about those > modules, now > would be a great time to implement those TODOs! The RNA_SearchIO.t is from ERPIN output; there's no easy way to generate it beyond having the user supply the info (or having the program author change the output). Will have to look at the others to see what's involved; maybe something for the priority list? > Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are > deprecated' warnings. I ran into this with XML::Simple data structures recently; there was an easy way around it via XML::Simple using forcearray(). It has to do with attempting to assign data to/from a hash in a specific way involving array references (though I can't remember exactly how; I slept since then). > To debug a particular test you could say: > BIOPERLDEBUG=1 ./Build test --verbose --test_files t/Sopma.t > > > I've updated the HOWTO for writing test scripts: > http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests Good work! chris From schlesi at ebi.ac.uk Wed Jun 27 07:57:27 2007 From: schlesi at ebi.ac.uk (Felix Schlesinger) Date: Wed, 27 Jun 2007 12:57:27 +0100 Subject: [Bioperl-l] Selecting columns from alignment Message-ID: <7317d50c0706270457i1c3d92a8hb124fa663f51b837@mail.gmail.com> Hi, is there an elegant way to select columns from an alignment object fulfilling a certain property (for example less than x gaps)? Everything I can see from Align::AlignI seems to involve looking at the individual sequences, creating lots of slices and appending them. 
Is there a better way in bioperl or, failing that, does anyone know of a software package with similar functionality (t-coffee has lots of filters for alignments, but nothing to select columns besides by position it seems). Ideally this would also return a mapping from old to new positions in one of the sequences of course. Thanks Felix From cjfields at uiuc.edu Wed Jun 27 10:36:41 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 09:36:41 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <18049.30026.61328.134490@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> Message-ID: <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> On Jun 26, 2007, at 3:21 PM, George Hartzell wrote: > ... > If you have a dev.open-bio.org account and you're in the bioperl > group, you're good to get at it via: > > file:///home/hartzell/bioperl > > or > > svn+ssh://dev.open-bio.org/home/hartzell/bioperl I managed to get it working using file://. Haven't tried svn+ssh yet but I've had persistent problems getting ssh to work properly on my macbook; not sure why yet but I haven't had time to play around with it. > There are a couple of things to think about: > > - how are we going to provide access. I *think* that I heard a > decision to use http:// and https://. Who gets to set that up? That hasn't been decided yet and will be up to a consensus of the core devs, but I think the odds are in favor of allowing https:// but against allowing http://. As for setup that could be anyone with admin privs, though it may be best left up to Chris D, Jason, or Mauricio. > - what do we want to do about keywords.
The cvs2svn tool guesses > and automatically sets the svn:keywords property to Author Date > Revision and Id on many of the files in the tree. If it looks > like it got it right, we can stick with it. Or, we can disable > that conversion and I've cribbed a little script that'll grep out > files using Id and set the svn:keywords property accordingly. Probably again a consensus issue, but you can choose one route. My inclination is the former if it's easier. > - what do we want to do about svn:ignore? I haven't seen any > .cvsignore files. Not sure. I've never used one personally, but (as Jason suggests) if you have ideas for one you can propose them, or we can suggest devs set up svn::ignore locally. > Beyond that, how does the repo look? Seems fine, though a simple 'svn file:///home/hartzell/bioperl' checkout gets everything (all distros, branches, etc). We need to make sure everyone uses 'svn co file:///home/hartzell/bioperl/bioperl- live/trunk /live' or similar if they just want the latest core/db/etc. We'll also need to start a svn wiki page to show how to get relevant distros (similar in style probably to the cvs page, with dev information, how to set up ssh keys, https stuff, etc). > How are we going to cut over? > > Are we going to try to push svn commits to the read-mostly CVS repo, > or just keep it around for history's sake (I lean towards the latter). I think a clean cut-over. Everyone would be warned to hold commits for a day (lest they be lost), then probably do something in this order: - switch cvs to read-only except for svn commits - run a clean cvs2svn - set up svn as read/write - set up test commits to cvs via svn - disable cvs commit messages to bioperl-guts, enable svn commit messages in it's place. - push svn commits over to read-only cvs cvs >>must<< be read-only after that point (no cvs->svn commits), with write access only available through svn. 
If at some future point there is no reason to keep it around or that it is more trouble than it's worth, we can make a decision then on cvs's fate. > g. chris From rvos at interchange.ubc.ca Wed Jun 27 10:23:25 2007 From: rvos at interchange.ubc.ca (rvos) Date: Wed, 27 Jun 2007 07:23:25 -0700 (PDT) Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] Message-ID: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> > Are we going to try to push svn commits to the read-mostly CVS repo, > or just keep it around for history's sake (I lean towards the latter). I'm a little confused - surely once the svn is up and running we'll want *no more* cvs commits? Parallel repositories that each accumulate stuff will be a nightmare. I'm probably just not getting your point. Rutger From cjfields at uiuc.edu Wed Jun 27 11:18:03 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 10:18:03 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> Message-ID: <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> On Jun 27, 2007, at 9:23 AM, rvos wrote: > >> Are we going to try to push svn commits to the read-mostly CVS repo, >> or just keep it around for history's sake (I lean towards the >> latter). > > I'm a little confused - surely once the svn is up and running we'll > want *no more* cvs commits? Parallel repositories that each > accumulate stuff will be a nightmare. I'm probably just not getting > your point. > > Rutger Most projects make a clean break with cvs (no more commits) for the reasons you point out. Not sure how the other core devs feel about that but I could go for that; it would def. prevent headaches. We could keep cvs for the time being as read-only, with no svn->cvs syncing. 
There are few projects which have (as a phase-out plan) old read-only cvs repositories available, with an automatic svn->cvs commit following every new svn commit. Not sure how that works, esp. for branching/merging and so on which I could see potentially getting hairy. chris From cjfields at uiuc.edu Wed Jun 27 12:05:49 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 11:05:49 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <18049.30026.61328.134490@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> Message-ID: <5EA56270-3427-4995-B3C1-2789229AACF1@uiuc.edu> On Jun 26, 2007, at 3:21 PM, George Hartzell wrote: > ...If you have a dev.open-bio.org account and you're in the bioperl > group, you're good to get at it via: > > file:///home/hartzell/bioperl > > or > > svn+ssh://dev.open-bio.org/home/hartzell/bioperl Did manage to get svn+ssh working (with some password harassment); core tests passed enough that I think everything's okay. If ssh keys are set up correctly (mine aren't) it should work fine. 
chris From dmessina at wustl.edu Wed Jun 27 12:27:32 2007 From: dmessina at wustl.edu (David Messina) Date: Wed, 27 Jun 2007 11:27:32 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> Message-ID: > [Chris] > > I managed to get it working using file://. Haven't tried svn+ssh yet > but I've had persistent problems getting ssh to work properly on my > macbook; not sure why yet but I haven't had time to play around > with it. I just did a checkout and a test commit, both via svn+ssh -- works great for me. >> [George] >> >> - what do we want to do about keywords. The cvs2svn tool guesses >> and automatically sets the svn:keywords property to Author Date >> Revision and Id on many of the files in the tree. If it looks >> like it got it right, we can stick with it. Or, we can disable >> that conversion and I've cribbed a little script that'll grep out >> files using Id and set the svn:keywords property accordingly. I would think we would want "Author Date Id Rev URL" set on everything, no? So either cvs2svn or your tool (whichever you think is better), followed by svn propset svn:keywords "Author Date Id Rev URL" * from the root of a working copy would take care of all of the existing files in the repository, I think. George knows more about this than I do, but I think you can set up a global config file with enable-auto-props = yes * = svn:keywords="Author Date Id Rev URL" to ensure it gets set on any future additions to the repository. >> - what do we want to do about svn:ignore? I haven't seen any >> .cvsignore files. > > Not sure.
I've never used one personally, but (as Jason suggests) if > you have ideas for one you can propose them, or we can suggest devs > set up svn:ignore locally. I use the default global-ignores global-ignores = *.o *.lo *.la #*# .*.rej *.rej .*~ *~ .#* .DS_Store (again, in my system-wide config file), but I'm not tied to that. I do think we should have one, though; individuals can easily override any settings in the system-wide config with their own ~/.subversion/config. >> Beyond that, how does the repo look? Looks great, George! Thanks for doing this. Dave From hartzell at alerce.com Wed Jun 27 13:00:53 2007 From: hartzell at alerce.com (George Hartzell) Date: Wed, 27 Jun 2007 13:00:53 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> Message-ID: <18050.38853.526224.791878@almost.alerce.com> rvos writes: > > > Are we going to try to push svn commits to the read-mostly CVS repo, > > or just keep it around for history's sake (I lean towards the latter). > > I'm a little confused - surely once the svn is up and running we'll > want *no more* cvs commits? Parallel repositories that each > accumulate stuff will be a nightmare. I'm probably just not getting > your point. There had been some point of keeping a CVS repository around as a read-only mirror of the svn repo, presumably for people whose habits or setup won't let them use svn. In theory, each commit to the svn repo can be automagically pushed down into CVS w/out user intervention; Google will tell you how, but I've never run anything that way. g.
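A minimal sketch of the client-side settings Dave describes earlier in this thread, assuming they go in a Subversion runtime config file (per-user ~/.subversion/config, or the system-wide equivalent). The [miscellany] and [auto-props] section names are the standard ones; the ignore patterns and keyword list are copied from Dave's message. The unquoted form of the keyword list is an assumption here, so check it against your client version before relying on it.

```ini
# Sketch only: values taken from the thread, not an agreed project policy.

[miscellany]
# Patterns skipped by 'svn status', 'svn add' and 'svn import'.
global-ignores = *.o *.lo *.la #*# .*.rej *.rej .*~ *~ .#* .DS_Store
# Turn on the [auto-props] section below.
enable-auto-props = yes

[auto-props]
# Set keyword expansion on every newly added or imported file.
* = svn:keywords=Author Date Id Rev URL
```

Auto-props are applied by the client at 'svn add' / 'svn import' time, so each committer would need this in their own config, and it does nothing for files already in the repository. For those, a one-off propset from the top of a working copy is still needed; running it with -R on '.' rather than on '*' should reach files below the top level as well.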
From dmessina at wustl.edu Wed Jun 27 13:27:01 2007 From: dmessina at wustl.edu (David Messina) Date: Wed, 27 Jun 2007 12:27:01 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> Message-ID: <99969FC2-479E-408C-AADB-7664EBE937CF@wustl.edu> > [Chris] > We'll also need to start a svn wiki page to show how to get relevant > distros (similar in style probably to the cvs page, with dev > information, how to set up ssh keys, https stuff, etc). I cloned the CVS page and have started adapting it for Subversion: http://www.bioperl.org/wiki/Using_Subversion I'll do some more on it later today, but if anyone wants to fiddle with it in the interim, please do. Dave From n.haigh at sheffield.ac.uk Wed Jun 27 14:44:16 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Wed, 27 Jun 2007 19:44:16 +0100 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <46823ABE.2080300@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> Message-ID: <4682B000.2050707@sheffield.ac.uk> Sendu Bala wrote: > Sendu Bala wrote: >> Sendu Bala wrote: >>> In considering updating all the test scripts to [... use] >>> t/lib/BioperlTest.pm >> I'm now in the process of converting all test scripts. > > And I've now completed that job (for bioperl-live at least), except for > t/EUtilities.t since I know Chris is working on it. > > > In addition to converting to Test::More where necessary, I've also made > all psuedo-TODO blocks real ones. 
Previously I had advised to use SKIP > blocks instead since TODO blocks need a Test::Harness upgrade. However I > think in the next release we ought to make such upgrading compulsory > (which should be automatic when combined with compulsory usage of > Module::Build and Test::More in turn: users simply have to update CPAN). > > > The conversion to BioperlTest directly led to the discovery and fixing > of 6 minor bugs, so was certainly not without merit. > > > No user or developer needs to have BIOPERLDEBUG permanently set to true > anymore. To run all tests you just have to answer yes to the BioDBGFF > and networking questions of 'perl Build.PL'. With './Build test' you > then get clean, easy-to-read output where it is obvious to see that we > currently have these issues: > > t/Sopma.t and t/BioGraphics.t still have fails that I mentioned in > another thread. > > t/protgraph.t, t/blast_pull.t, t/SearchIO.t, t/RestrictionIO.t, > t/RNA_SearchIO.t, t/PopGen.t, t/Genewise.t, t/Assembly.t and > t/Annotation.t all have TODO tests. If you know about those modules, now > would be a great time to implement those TODOs! > > Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are > deprecated' warnings. Ah, that reminds me! I recently tried to do an install of the cvs head (a week or two ago) on a clean installation of Debian 4.0 (etch). During the installation, of dependencies, Bio::ASN1::EntrezGene threw an error as it depends on Bioperl. I seem to remember this circular dependency cropping up before - am I correct - and can you remind me how this was "fixed"? 
Cheers Nath From bix at sendu.me.uk Wed Jun 27 14:52:01 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 27 Jun 2007 19:52:01 +0100 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <4682B000.2050707@sheffield.ac.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> Message-ID: <4682B1D1.3080206@sendu.me.uk> Nathan S. Haigh wrote: > I recently tried to do an install of the cvs head (a week or two ago) on > a clean installation of Debian 4.0 (etch). During the installation, of > dependencies, Bio::ASN1::EntrezGene threw an error as it depends on > Bioperl. I seem to remember this circular dependency cropping up before > - am I correct - and can you remind me how this was "fixed"? Yes, it always happens. It was 'fixed' by being completely ignored by me. Installation is guaranteed to fail, but if you really want it, trying to install again after you already have Bioperl installed will result in success. Clearly something nicer could be done. Suggestions on a postcard... From cjfields at uiuc.edu Wed Jun 27 15:01:01 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 14:01:01 -0500 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <4682B000.2050707@sheffield.ac.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> Message-ID: On Jun 27, 2007, at 1:44 PM, Nathan S. Haigh wrote: > Sendu Bala wrote: >> ... >> Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are >> deprecated' warnings. > > Ah, that reminds me! > > I recently tried to do an install of the cvs head (a week or two > ago) on > a clean installation of Debian 4.0 (etch). During the installation, of > dependencies, Bio::ASN1::EntrezGene threw an error as it depends on > Bioperl. 
I seem to remember this circular dependency cropping up > before > - am I correct - and can you remind me how this was "fixed"? > > Cheers > Nath Wonder if Mingyi Liu would allow Bio::ASN1::EntrezGene to become part of Bioperl (and he could become a dev). That would solve it. chris From n.haigh at sheffield.ac.uk Wed Jun 27 15:16:40 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Wed, 27 Jun 2007 20:16:40 +0100 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> Message-ID: <4682B798.1010409@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Chris Fields wrote: > > On Jun 27, 2007, at 1:44 PM, Nathan S. Haigh wrote: > >> Sendu Bala wrote: >>> ... >>> Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are >>> deprecated' warnings. >> >> Ah, that reminds me! >> >> I recently tried to do an install of the cvs head (a week or two ago) on >> a clean installation of Debian 4.0 (etch). During the installation of >> dependencies, Bio::ASN1::EntrezGene threw an error as it depends on >> Bioperl. I seem to remember this circular dependency cropping up before >> - am I correct - and can you remind me how this was "fixed"? >> >> Cheers >> Nath > > Wonder if Mingyi Liu would allow Bio::ASN1::EntrezGene to become part of > Bioperl (and he could become a dev). That would solve it. > > chris Just to put the feelers out to see what people think. It seems (to me at least) that Bioperl modules could/should? be released as individual modules and that "bioperl" would really constitute a "bundle" of all these modules - in terms of CPAN anyway. Am I correct in this thinking? The Bio::ASN1::EntrezGene could simply require a particular module rather than the whole of bioperl - might get out of the circular dependency theoretically!?
I'm not suggesting moving in this direction, but just wondered what others thought about this concept? Nath -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGgreYczuW2jkwy2gRAi5IAJ9/Alq1fktEmAF16DlKcBVcy7d+jQCeIj+X tOFQUQ7cGJLUITEDw1+QLxc= =Yc+g -----END PGP SIGNATURE----- From cjfields at uiuc.edu Wed Jun 27 15:31:44 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 14:31:44 -0500 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <4682B798.1010409@sheffield.ac.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> Message-ID: <33C76559-4771-4FDC-9EEA-1645BC3C576C@uiuc.edu> On Jun 27, 2007, at 2:16 PM, Nathan S. Haigh wrote: > ... > > Just to put the feelers out to see what people think. > > It seems (to me at least) that Bioperl modules could/should? be > released > as individual modules and that "bioperl" would really constitute a > "bundle" of all these modules - in terms of CPAN anyway. Am I > correct in > this thinking? The Bio::ASN1::EntrezGene could simply require a > particular module rather than the whole of bioperl - might get out of > the circular dependency theoretically!? > > I'm not suggesting moving in this direction, but just wondered what > others thought about this concept? > > Nath Well, Steve suggested splitting some of core into distinct groups, which I tend to agree with in some respects (speed up releases for those modules, such as SearchIO, DB, Graphics). The problem we have yet to solve is what we consider 'core'. Is it Bio::Seq and related? Should it include Bio::DB*? Should it just be Bio::* modules with no or very few external dependencies? And so on..., probably not a decision we want to make immediately (until after svn migration, tests finished, maybe a release or two, a beer)... 
The Bioperl module dependency that Bio::ASN1::EntrezGene has is Bio::Index::AbstractSeq. You could try a test build of Bio::ASN1::EntrezGene to see what happens. chris From hlapp at gmx.net Wed Jun 27 15:49:15 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 27 Jun 2007 16:49:15 -0300 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> Message-ID: On Jun 27, 2007, at 1:27 PM, David Messina wrote: > I would think we would want "Author Date Id Rev URL" set on > everything, no?. So either cvs2svn or your tool (whichever you think > is better), followed by > > svn propset svn:keywords "Author Date Id Rev URL" * Shouldn't this be done recursively? -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Wed Jun 27 15:50:27 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 27 Jun 2007 16:50:27 -0300 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> Message-ID: On Jun 27, 2007, at 12:18 PM, Chris Fields wrote: > Most projects make a clean break with cvs (no more commits) for the > reasons you point out. Not sure how the other core devs feel about > that but I could go for that; it would def. prevent headaches. There shouldn't be any cvs write support after the cut-over I think. 
I don't see the benefit that would justify the huge headache potential. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Wed Jun 27 16:01:40 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 15:01:40 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> Message-ID: <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> On Jun 27, 2007, at 2:50 PM, Hilmar Lapp wrote: > > On Jun 27, 2007, at 12:18 PM, Chris Fields wrote: > >> Most projects make a clean break with cvs (no more commits) for the >> reasons you point out. Not sure how the other core devs feel about >> that but I could go for that; it would def. prevent headaches. > > There shouldn't be any cvs write support after the cut-over I > think. I don't see the benefit that would justify the huge headache > potential. > > -hilmar Agreed, so maybe we should set that in stone. That means no svn->cvs syncing post-migration as well, I assume. Now how about a quick straw poll, what kind of access? svn+ssh is already available, but some (Aaron among them) have indicated they would like https as well (not sure how involved it would be to set up). 
chris From hlapp at gmx.net Wed Jun 27 16:08:40 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 27 Jun 2007 17:08:40 -0300 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> Message-ID: <4FDAE951-7CFF-40B1-9CE3-9BCEAD37E58F@gmx.net> On Jun 27, 2007, at 5:01 PM, Chris Fields wrote: > That means no svn->cvs syncing post-migration as well, I assume. That's a bit of a different story. People out there have URL links into our anonymous CVS repository. If it's not too troublesome (and I tend to think it's not) I'd like to maintain those in working order, either b/c there is still a sync'ed r/o cvs, or b/c of a cgi script that maps between the URL flavors (i.e., that maps a CVS-style URL to the equivalent SVN link). -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hartzell at alerce.com Wed Jun 27 16:15:10 2007 From: hartzell at alerce.com (George Hartzell) Date: Wed, 27 Jun 2007 16:15:10 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> Message-ID: <18050.50510.84363.355034@almost.alerce.com> David Messina writes: > > [Chris] > > > > I managed to get it working using file://.
Haven't tried svn+ssh yet > > but I've had persistent problems getting ssh to work properly on my > > macbook; not sure why yet but I haven't had time to play around > > with it. > > I just did a checkout and a test commit, both via svn+ssh -- works > great for me. Is there anyone working outside of bioperl-{run,live,ext}? g. From bix at sendu.me.uk Wed Jun 27 16:22:13 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 27 Jun 2007 21:22:13 +0100 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <4682B798.1010409@sheffield.ac.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> Message-ID: <4682C6F5.4020406@sendu.me.uk> Nathan S. Haigh wrote: > It seems (to me at least) that Bioperl modules could/should? be released > as individual modules and that "bioperl" would really constitute a > "bundle" of all these modules - in terms of CPAN anyway. Am I correct in > this thinking? The Bio::ASN1::EntrezGene could simply require a > particular module rather than the whole of bioperl - might get out of > the circular dependency theoretically!? No, it wouldn't. The 'problem' only arises because the user is /choosing/ to install both Bioperl and Bio::ASN1::EntrezGene at the same time. So even if Bioperl was released as separate modules there would still be that 'bundle' and users would still choose to do the same thing: install all the Bioperl modules as well as all its /optional/ recommended modules. And there lies the problem: Bio::ASN1::EntrezGene requires Bioperl modules, and one Bioperl module requires Bio::ASN1::EntrezGene, so the circularity isn't solved. (FYI: Bio::ASN1::EntrezGene requires Bio::Index::AbstractSeq Bio::Index::AbstractSeq requires a couple of Bioperl modules, including Bio::Root::Root Bio::SeqIO::entrezgene requires Bio::ASN1::EntrezGene and a bunch of Bioperl modules, including Bio::Root::Root. 
) You only avoid circularity by choosing not to install everything in one go. Which is something you can do right now with no problems. From n.haigh at sheffield.ac.uk Wed Jun 27 16:24:18 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Wed, 27 Jun 2007 21:24:18 +0100 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> Message-ID: <4682C772.5070502@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hilmar Lapp wrote: > On Jun 27, 2007, at 12:18 PM, Chris Fields wrote: > >> Most projects make a clean break with cvs (no more commits) for the >> reasons you point out. Not sure how the other core devs feel about >> that but I could go for that; it would def. prevent headaches. > > There shouldn't be any cvs write support after the cut-over I think. > I don't see the benefit that would justify the huge headache potential. > > -hilmar I agree. A clean switch from cvs read/write to svn read/write plus cvs read only sounds the least problematic! However, how will links to cvs be dealt with? Links on Bioperl could be switched over to point to svn, but what about possible links from external sources? Maybe a more generic approach of redirection could work? Or a simple warning page stating the fact that we have moved from cvs to svn and provide a common link to follow? 
Nath -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGgsdyczuW2jkwy2gRAtuyAKDIpN0TNX0U7sTuE3i+fj6WFZ1K0QCfcX7Y 81KurFwJlRtYFxSmLZP56Sk= =pp7b -----END PGP SIGNATURE----- From hlapp at gmx.net Wed Jun 27 16:30:19 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 27 Jun 2007 17:30:19 -0300 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <18049.30026.61328.134490@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> Message-ID: <834150E7-9307-4B02-B5F0-6FB01393435D@gmx.net> On Jun 26, 2007, at 5:21 PM, George Hartzell wrote: > > svn+ssh://dev.open-bio.org/home/hartzell/bioperl > Cool - this works for me. One thing I notice is that in cvs log you see which version is in which branch which is useful to answer user queries that might be a version problem. svn log doesn't seem to want to show that. Does anyone have ideas for how to do this in svn? -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Wed Jun 27 16:32:18 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 27 Jun 2007 17:32:18 -0300 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <4682C772.5070502@sheffield.ac.uk> References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <4682C772.5070502@sheffield.ac.uk> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Jun 27, 2007, at 5:24 PM, Nathan S. Haigh wrote: > However, how will links to cvs be dealt with? 
Well I said before that probably one can write a couple of lines of Perl to write a cgi script that returns the appropriate redirect URL with a redirect status code. -hilmar - -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (Darwin) iD8DBQFGgslWuV6N2JxL7qsRAvsTAKDjR18NzWzlj74mCF+diNpe2dLV2ACgn/4Y f6sJ/ngeKEGpKHgyAHM1DAA= =8n0E -----END PGP SIGNATURE----- From cjfields at uiuc.edu Wed Jun 27 16:50:11 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 15:50:11 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <834150E7-9307-4B02-B5F0-6FB01393435D@gmx.net> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <834150E7-9307-4B02-B5F0-6FB01393435D@gmx.net> Message-ID: <9DD44BC0-95D5-43E1-83F3-E3ACC6825563@uiuc.edu> On Jun 27, 2007, at 3:30 PM, Hilmar Lapp wrote: > > On Jun 26, 2007, at 5:21 PM, George Hartzell wrote: > >> >> svn+ssh://dev.open-bio.org/home/hartzell/bioperl >> > > Cool - this works for me. > > One thing I notice is that in cvs log you see which version is in > which branch which is useful to answer user queries that might be a > version problem. svn log doesn't seem to want to show that. Does > anyone have ideas for how to do this in svn? > > -hilmar We prob. should move it to a new directory ASAP which george can write to when he needs to update. cvs is in /home/repository/bioperl, so maybe something similar, like /home/svn/repository/bioperl?
chris From cjfields at uiuc.edu Wed Jun 27 16:51:37 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 15:51:37 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <4FDAE951-7CFF-40B1-9CE3-9BCEAD37E58F@gmx.net> References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> <4FDAE951-7CFF-40B1-9CE3-9BCEAD37E58F@gmx.net> Message-ID: <4D8CAAD9-4774-47FB-84E0-7FBA50EC377B@uiuc.edu> On Jun 27, 2007, at 3:08 PM, Hilmar Lapp wrote: > > On Jun 27, 2007, at 5:01 PM, Chris Fields wrote: > >> That means no svn->cvs syncing post-migration as well, I assume. > > That's a bit of a different story. People out there have URL links > into our anonymous CVS repository. If it's not too troublesome (and > tend to I think it's not) I'd like to maintain those in working > order, either b/c there is still a sync'ed r/o cvs, or b/c of a cgi > script that maps between the URL flavors (i.e., that maps a CVS- > style URL to the equivalent SVN link). > > -hilmar I'll try getting a wiki page up as a checklist for this, including what direction we're heading in, ideas (your list and CGI redirect ideas, svn::ignore issues, etc). Dave has already started on the 'getting bioperl using svn' wiki page. If we intend to sync cvs with svn we need to find the right tools or at least check for other projects which have done something similar. I haven't googled on that yet but I'll attempt to tonight. chris From cjfields at uiuc.edu Wed Jun 27 16:53:08 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 15:53:08 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: Message-ID: <4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu> bioperl-run also. I think the run CVS repo has some binary files, so if there are any problems with cvs2svn it'll be there. 
chris On Jun 27, 2007, at 3:19 PM, Brian Osborne wrote: > George, > > bioperl-db and bioperl-network should be included, I think. > > Brian O > > > On 6/27/07 4:15 PM, "George Hartzell" wrote: > >> David Messina writes: >>>> [Chris] >>>> >>>> I managed to get it working using file://. Haven't tried svn >>>> +ssh yet >>>> but I've had persistent problems getting ssh to work properly on my >>>> macbook; not sure why yet but I haven't had time to play around >>>> with it. >>> >>> I just did a checkout and a test commit, both via svn+ssh -- works >>> great for me. >> >> Is there anyone working outside of bioperl-{run,live,ext}? >> >> g. >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Wed Jun 27 17:05:50 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 27 Jun 2007 22:05:50 +0100 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <4682C6F5.4020406@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> Message-ID: <4682D12E.3000803@sendu.me.uk> Sendu Bala wrote: > Nathan S. Haigh wrote: >> It seems (to me at least) that Bioperl modules could/should? be released >> as individual modules and that "bioperl" would really constitute a >> "bundle" of all these modules - in terms of CPAN anyway. Am I correct in >> this thinking? The Bio::ASN1::EntrezGene could simply require a >> particular module rather than the whole of bioperl - might get out of >> the circular dependency theoretically!? > > No, it wouldn't. [snip] > You only avoid circularity by choosing not to install everything in one > go. Errr... I take that back. 
Since CPAN bundles install things in a certain order, you just have to make sure that everything Bio::ASN1::EntrezGene needs is installed first, then Bio::ASN1::EntrezGene, then Bio::SeqIO::entrezgene. But the main problem with this approach is that maintenance, global-style code improvements and releases become a nightmare. I could, perhaps, imagine a scenario where the repository stayed as-is (one monolithic collection), but the dist action of Build.PL could be altered to generate a release package per module instead of one big release package of all modules, as is currently the case. Is there much value in doing that? Does anyone want me to look into the feasibility of such a thing? From bosborne11 at verizon.net Wed Jun 27 16:19:47 2007 From: bosborne11 at verizon.net (Brian Osborne) Date: Wed, 27 Jun 2007 16:19:47 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <18050.50510.84363.355034@almost.alerce.com> Message-ID: George, bioperl-db and bioperl-network should be included, I think. Brian O On 6/27/07 4:15 PM, "George Hartzell" wrote: > David Messina writes: >>> [Chris] >>> >>> I managed to get it working using file://. Haven't tried svn+ssh yet >>> but I've had persistent problems getting ssh to work properly on my >>> macbook; not sure why yet but I haven't had time to play around >>> with it. >> >> I just did a checkout and a test commit, both via svn+ssh -- works >> great for me. > > Is there anyone working outside of bioperl-{run,live,ext}? > > g. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From n.haigh at sheffield.ac.uk Wed Jun 27 17:25:53 2007 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Wed, 27 Jun 2007 22:25:53 +0100 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <4682D12E.3000803@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> Message-ID: <4682D5E1.2030507@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Sendu Bala wrote: > Sendu Bala wrote: >> Nathan S. Haigh wrote: >>> It seems (to me at least) that Bioperl modules could/should? be released >>> as individual modules and that "bioperl" would really constitute a >>> "bundle" of all these modules - in terms of CPAN anyway. Am I correct in >>> this thinking? The Bio::ASN1::EntrezGene could simply require a >>> particular module rather than the whole of bioperl - might get out of >>> the circular dependency theoretically!? >> >> No, it wouldn't. > [snip] >> You only avoid circularity by choosing not to install everything in >> one go. > > Errr... I take that back. Since CPAN bundles install things in a certain > order, you just have to make sure that everything Bio::ASN1::EntrezGene > needs is installed first, then Bio::ASN1::EntrezGene, then > Bio::SeqIO::entrezgene. > > But the main problem with this approach is that maintenance, > global-style code improvements and releases become a nightmare. I could, > perhaps, imagine a scenario where the repository stayed as-is (one > monolithic collection), but the dist action of Build.PL could be altered > to generate a release package per module instead of one big release > package of all modules, as is currently the case. > > Is there much value in doing that? Does anyone want me to look into the > feasibility of such a thing? 
I think the value would be in external modules being able to use bioperl modules more easily (I'm not sure how many modules depend, or have depended, on bioperl), as they would depend on a single module rather than the whole package. However, how would the dependencies of each module be handled? I'm clearly thinking aloud, but... maybe this would tease apart "cliques" of interdependent modules, which could themselves be shipped as bundles (e.g. Bio::Graphics), with a "master" bioperl bundle that installs all the bioperl modules. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGgtXhczuW2jkwy2gRAiftAKDZQGDpaq5saEyE3ZfPyFqli4j+8QCfXbIB 2EZjccEFEzfFlx4H47gzwLk= =nobl -----END PGP SIGNATURE----- From hlapp at gmx.net Wed Jun 27 17:35:28 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 27 Jun 2007 18:35:28 -0300 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu> References: <4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu> Message-ID: <9FEF16FD-47F6-4D7B-A1E6-1A92CC82570C@gmx.net> Is there a reason not to port every subproject over? -hilmar On Jun 27, 2007, at 5:53 PM, Chris Fields wrote: > bioperl-run also. I think the run CVS repo has some binary files, so > if there are any problems with cvs2svn it'll be there. > > chris > > On Jun 27, 2007, at 3:19 PM, Brian Osborne wrote: > >> George, >> >> bioperl-db and bioperl-network should be included, I think. >> >> Brian O >> >> >> On 6/27/07 4:15 PM, "George Hartzell" wrote: >> >>> David Messina writes: >>>>> [Chris] >>>>> >>>>> I managed to get it working using file://. Haven't tried svn >>>>> +ssh yet >>>>> but I've had persistent problems getting ssh to work properly >>>>> on my >>>>> macbook; not sure why yet but I haven't had time to play around >>>>> with it.
>>>> >>>> I just did a checkout and a test commit, both via svn+ssh -- works >>>> great for me. >>> >>> Is there anyone working outside of bioperl-{run,live,ext}? >>> >>> g. >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Wed Jun 27 17:36:29 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 16:36:29 -0500 Subject: [Bioperl-l] Splits again, formerly Test overhaul complete In-Reply-To: <4682D12E.3000803@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> Message-ID: <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> On Jun 27, 2007, at 4:05 PM, Sendu Bala wrote: > Sendu Bala wrote: >> Nathan S. Haigh wrote: >>> It seems (to me at least) that Bioperl modules could/should? be >>> released >>> as individual modules and that "bioperl" would really constitute a >>> "bundle" of all these modules - in terms of CPAN anyway. Am I >>> correct in >>> this thinking? The Bio::ASN1::EntrezGene could simply require a >>> particular module rather than the whole of bioperl - might get >>> out of >>> the circular dependency theoretically!? >> No, it wouldn't. 
> [snip] >> You only avoid circularity by choosing not to install everything >> in one go. > > Errr... I take that back. Since CPAN bundles install things in a > certain order, you just have to make sure that everything > Bio::ASN1::EntrezGene needs is installed first, then > Bio::ASN1::EntrezGene, then Bio::SeqIO::entrezgene. > > But the main problem with this approach is that maintenance, global-style > code improvements and releases become a nightmare. I could, > perhaps, imagine a scenario where the repository stayed as-is (one > monolithic collection), but the dist action of Build.PL could be > altered to generate a release package per module instead of one big > release package of all modules, as is currently the case. > > Is there much value in doing that? Does anyone want me to look into > the feasibility of such a thing? Not for the time being, at least in my opinion. Too much on our plate at this point with svn migration, test conversion, bugzilla running over (next point of attack!), etc. Maybe something to think about after, though I like the idea of a few splits to core as Steve suggested (SearchIO, Graphics, some LWP-related DB modules). My (albeit extreme) thought is to have a lean-and-mean set of 'core' modules with as few external dependencies as possible, which could work around the circular dependency issue in this case:

                    dep.on                      dep.on
Bio::Auxiliary      ----->   ASN1::EntrezGene   ----->   core
(with EntrezGene)                                        (basic SeqIO, Index, DB, etc)
        \---->------>------>--- dep.on --->------>------>----/

Bioperl auxiliary modules would list core as a required dependency along with anything else needed for that particular aux. section (e.g. XML parsers, LWP, GD, etc.). The whole mess, if needed, would be installed using Bundle::BioPerl or similar, with no part released w/o testing on the whole 'base' to ensure proper interaction. If a fix needed to be made in one set, make the fix, test against bioperl 'base' as a whole, and release when possible.
No need to wait for a full-fledged 1.5.3 release. Maybe wishful thinking... chris From cjfields at uiuc.edu Wed Jun 27 17:44:47 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 16:44:47 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <9FEF16FD-47F6-4D7B-A1E6-1A92CC82570C@gmx.net> References: <4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu> <9FEF16FD-47F6-4D7B-A1E6-1A92CC82570C@gmx.net> Message-ID: <1CB82DD7-A997-47E8-A6A5-2AC9B375E875@uiuc.edu> We should port them all, yes. chris On Jun 27, 2007, at 4:35 PM, Hilmar Lapp wrote: > Is there a reason not to port every subproject over? > > -hilmar From cjfields at uiuc.edu Wed Jun 27 17:53:02 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 16:53:02 -0500 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <4682D5E1.2030507@sheffield.ac.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <4682D5E1.2030507@sheffield.ac.uk> Message-ID: <1DF8175A-5212-40CE-A2C5-B9C34E057D00@uiuc.edu> On Jun 27, 2007, at 4:25 PM, Nathan S. Haigh wrote: >> ... >> Is there much value in doing that? Does anyone want me to look >> into the >> feasibility of such a thing? > > > I think the value would be in other external modules being able to use > bioperl modules with more ease (not sure how many modules have, or > currently depend on bioperl) as they would depend on a single module, > rather than the whole package. However, how would the dependencies of > each module be handled? I'm clearly thinking aloud, but....Maybe this > would tease apart "cliques" of modules that are interdependent? and > could in themselves be shipped as bundles e.g. Bio::Graphics and > have a > "master" bioperl bundle that installa all the bioperl modules. 
See my response to Sendu, and Steve Chervitz's original post and related thread: http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15315/focus=15315 which pretty much covers the same ground. I think at most 4-5 split 'cliques', including core, with the fewest possible dependencies in core. If we do any of this, it prob. should wait until after an svn migration and bugzilla bug stomping unless there is a (well-argued) advantage to doing it now. chris From n.haigh at sheffield.ac.uk Wed Jun 27 18:07:31 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Wed, 27 Jun 2007 23:07:31 +0100 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <1DF8175A-5212-40CE-A2C5-B9C34E057D00@uiuc.edu> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <4682D5E1.2030507@sheffield.ac.uk> <1DF8175A-5212-40CE-A2C5-B9C34E057D00@uiuc.edu> Message-ID: <4682DFA3.9090100@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Chris Fields wrote: > > On Jun 27, 2007, at 4:25 PM, Nathan S. Haigh wrote: > >>> ... >>> Is there much value in doing that? Does anyone want me to look into the >>> feasibility of such a thing? >> >> >> I think the value would be in other external modules being able to use >> bioperl modules with more ease (not sure how many modules have, or >> currently depend on bioperl) as they would depend on a single module, >> rather than the whole package. However, how would the dependencies of >> each module be handled? I'm clearly thinking aloud, but....Maybe this >> would tease apart "cliques" of modules that are interdependent? and >> could in themselves be shipped as bundles e.g. Bio::Graphics and have a >> "master" bioperl bundle that installa all the bioperl modules.
> > See my response to Sendu, and Steve Chervitz's original post and related > thread: > > http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15315/focus=15315 > > which pretty much covers the same ground. I think at most 4-5 split > 'cliques', including core, with the fewest possible dependencies in > core. If we do any of this, it prob. should wait until after an svn > migration and bugzilla bug stomping unless there is a (well-argued) > advantage to doing it now. > > chris That's fine by me - or should I say, the best way forward - I was really just thinking aloud :) Nath -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGgt+jczuW2jkwy2gRAhPmAKDCgI1BOp/MOQVUQhQGqWaRRfPTaACfTPix TSi/e8PtYTwpxn6x+ewrjBs= =7Vp1 -----END PGP SIGNATURE----- From bix at sendu.me.uk Wed Jun 27 18:43:48 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 27 Jun 2007 23:43:48 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> Message-ID: <4682E824.1050507@sendu.me.uk> Chris Fields wrote: > On Jun 27, 2007, at 4:05 PM, Sendu Bala wrote: >> But the main problem with this approach is that maintenance, global- >> style code improvements and releases become a nightmare. I could, >> perhaps, imagine a scenario where the repository stayed as-is (one >> monolithic collection), but the dist action of Build.PL could be >> altered to generate a release package per module instead of one big >> release package of all modules, as is currently the case. >> >> Is there much value in doing that? Does anyone want me to look into >> the feasibility of such a thing? 
> > Not for the time being, at least in my opinion. Too much on our > plate at this point with svn migration, test conversion, bugzilla > running over (next point of attack!), etc. Maybe something to think > about after, though I like the idea of a few splits to core as Steve > suggested (SearchIO, Graphics, some LWP-related DB modules). [snip] > If a fix needed to be made in one set, make the fix, test against > bioperl 'base' as a whole, and release when possible. No need to > wait for a full-fledged 1.5.3 release. What advantage is there of these defined splits instead of individual modules? As I see it you lose some of the potential benefits of breaking Bioperl up completely, whilst also suffering the maintenance problems I outlined in my objection to Steve's post. Being able to work on all Bioperl from a single cvs (ne svn) check out/ archive, whilst distributing it as individual modules on CPAN seems like the best of both worlds to me. What am I missing? From hartzell at alerce.com Wed Jun 27 20:41:01 2007 From: hartzell at alerce.com (George Hartzell) Date: Wed, 27 Jun 2007 20:41:01 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <9DD44BC0-95D5-43E1-83F3-E3ACC6825563@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <834150E7-9307-4B02-B5F0-6FB01393435D@gmx.net> <9DD44BC0-95D5-43E1-83F3-E3ACC6825563@uiuc.edu> Message-ID: <18051.925.23313.932916@almost.alerce.com> Chris Fields writes: > [...] > We prob. should move it to a new directory ASAP which george can > write to when he needs to update. cvs is in /home/repository/ > bioperl, so maybe something similar, like /home/svn/repository/bioperl? I'd be parsimonious (lazy...) and go for /home/svn/bioperl. g. 
From hartzell at alerce.com Wed Jun 27 20:46:29 2007 From: hartzell at alerce.com (George Hartzell) Date: Wed, 27 Jun 2007 20:46:29 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> Message-ID: <18051.1253.87485.235496@almost.alerce.com> Chris Fields writes: > [...] > Now how about a quick straw poll, what kind of access? svn+ssh is > already available, but some (Aaron among them) have indicated they > would like https as well (not sure how involved it would be to set up). What we do here, in large part, depends on what our host machine makes available to us. Is there an apache instance that we can use? Maybe a separate one? May someone among us configure it, or do we need to ask for help? (in other words, does anyone have sudo?) Is there some reason to not include http: (using Digest authentication so that passwords aren't passed in the clear?)? Maybe even go so far as to ask why bother with https:, it's not like we need to transfer any data encrypted.... g. From dmessina at wustl.edu Wed Jun 27 23:02:25 2007 From: dmessina at wustl.edu (David Messina) Date: Wed, 27 Jun 2007 22:02:25 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> Message-ID: On Jun 27, 2007, at 2:49 PM, Hilmar Lapp wrote: > > On Jun 27, 2007, at 1:27 PM, David Messina wrote: > >> I would think we would want "Author Date Id Rev URL" set on >> everything, no?. 
So either cvs2svn or your tool (whichever you think >> is better), followed by >> >> svn propset svn:keywords "Author Date Id Rev URL" * > > Shouldn't this be done recursively? Yep, good catch! Thanks, Hilmar. Should be: svn propset --recursive svn:keywords "Author Date Id Rev URL" * From jason at bioperl.org Wed Jun 27 23:29:09 2007 From: jason at bioperl.org (Jason Stajich) Date: Thu, 28 Jun 2007 00:29:09 -0300 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <18051.1253.87485.235496@almost.alerce.com> References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> <18051.1253.87485.235496@almost.alerce.com> Message-ID: I think Chris D and I will need to confer a bit on https+svn. I don't know when we'll have a good chance to discuss everything. At some point this discussion may need to be taken off bioperl and continued among just the interested parties, as we're delving into hardware geek land. The repository machine (dev) is a locked down machine meaning it only really runs ssh and not many servers include httpd. We have anonymous CVS (client and through httpd browsing) running on a separate machine (code) that has the info rsynced over every 10 or 15 minutes. The foundation websites and mailing lists run on a third machine (portal). If we decide to support https we'll need to spend a little time deciding how well we can keep it locked down - it will only be https not http for example and we may want to see about limiting ssh access to everyone if we migrate all OBF projects over to SVN and only support https. Again to re-iterate what I think we would do: - SVN read/write will live on 'dev', _WHEN_ we switch over no writes to the CVS repository.
It will be available by ssh+svn and potentially by https+svn - SVN read-only will live on 'code', it will be accessible by http+svn - CVS read-only will live on 'code', this will only be a sync from the SVN to the CVS. See http://svn2cvs.tigris.org/ for details As I tried to ask for in the past, would someone also illustrate the importance of why _WE_ need to switch to SVN on a wiki page on Bioperl so that when someone complains/asks about this in the future the arguments are already laid out. I am basically fine with it, but I don't honestly see a compelling reason beyond what has been mentioned wrt better integration in IDEs. http://bioperl.org/wiki/Why_SVN -jason On Jun 27, 2007, at 9:46 PM, George Hartzell wrote: > Chris Fields writes: >> [...] >> Now how about a quick straw poll, what kind of access? svn+ssh is >> already available, but some (Aaron among them) have indicated they >> would like https as well (not sure how involved it would be to set >> up). > > What we do here, in large part, depends on what our host machine makes > available to us. > > Is there an apache instance that we can use? Maybe a separate one? > > May someone among us configure it, or do we need to ask for help? (in > other words, does anyone have sudo?) > > Is there some reason to not include http: (using Digest authentication > so that passwords aren't passed in the clear?)? Maybe even go so far > as to ask why bother with https:, it's not like we need to transfer > any data encrypted.... > > g. 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From jason at bioperl.org Wed Jun 27 23:51:32 2007 From: jason at bioperl.org (Jason Stajich) Date: Thu, 28 Jun 2007 00:51:32 -0300 Subject: [Bioperl-l] Splits again In-Reply-To: <4682E824.1050507@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> Message-ID: <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org> Hey guys - I'm wading in a bit late as I haven't had time to keep up with the whole discussion. So you are suggesting 800+ individual CPAN modules? I don't think that is a good idea. Why would you split up Bio::Seq::RichSeq and Bio::Seq into two separate packages for example? I think if you really want to move away from the monolithic install it has to be more logical by function - but I am not that optimistic that this is going to actually be easier for people. Maybe I'm misunderstanding. What are the arguments for separating things -- to make it so people aren't scared by the number of modules so they'll code? It seems like some people just want it to be installed and run scripts - does having them install dozens of modules work? Do we need to consider how much this would suck for someone who can't use CPAN or Module::Build to automate dependency tracking and installation? How does it work when modules are deprecated? I'm not sure I have made up my mind on what I'd like to see, but at some point I think we need to get a clearer idea of what audience we are trying to serve best.
If we want it to be easy to install, maybe we should invest time in making OSX double-click installers, RPMs, and the Windows stuff easily installable. Or do we want to serve the developers who aren't using SVN, and so push out releases of modules ASAP? I just am not clear on the motivation for some of the proposed changes. Also - the main point I wanted to make - Can I suggest we spend a little time discussing what it will take to get a stable release for the current code as it stands (bioperl-live and bioperl-run)? It seems like we really need to do this first so that we have a stable release that can be followed by CVS -> SVN migration, then consider major changes to the repository structure and release packaging, and potential deprecation and incorporation of other modules. I assume there is no chance that we'd have a 1.6 candidate by BOSC next month? Will it be productive to schedule a fair amount of time at BOSC discussing how to partition out the packages into separate sub-packages after we've done a successful release rather than trying to change things right now? I realize not everyone will be there but maybe it will be easier to interact on this then. I think it will also be time to talk with Lincoln/Scott about how Gbrowse is structured and if that is working for them. There is so much code in different places that I think we need to figure out how to structure it properly so those packages can be released. It would probably mean moving Bio::Graphics, Bio::DB::GFF and Bio::DB::SeqFeature and gff tools for Gbrowse into separate packages so they could be released more regularly on par with Gbrowse schedules. Also I think someone needs to figure out Bio::Tools::GFF vs Bio::FeatureIO -- what do we want to do? I don't think we really fully support GFF3 that well -- the X2GFF scripts probably need some more good testing (where X is BLAST, FASTA, Sim4, GenBank, EMBL, etc.) and/or migration to the proper GFF writing.
-jason On Jun 27, 2007, at 7:43 PM, Sendu Bala wrote: > Chris Fields wrote: >> On Jun 27, 2007, at 4:05 PM, Sendu Bala wrote: >>> But the main problem with this approach is that maintenance, global- >>> style code improvements and releases become a nightmare. I could, >>> perhaps, imagine a scenario where the repository stayed as-is (one >>> monolithic collection), but the dist action of Build.PL could be >>> altered to generate a release package per module instead of one big >>> release package of all modules, as is currently the case. >>> >>> Is there much value in doing that? Does anyone want me to look into >>> the feasibility of such a thing? >> >> Not for the time being, at least in my opinion. Too much on our >> plate at this point with svn migration, test conversion, bugzilla >> running over (next point of attack!), etc. Maybe something to think >> about after, though I like the idea of a few splits to core as Steve >> suggested (SearchIO, Graphics, some LWP-related DB modules). > [snip] >> If a fix needed to be made in one set, make the fix, test against >> bioperl 'base' as a whole, and release when possible. No need to >> wait for a full-fledged 1.5.3 release. > > What advantage is there of these defined splits instead of individual > modules? As I see it you lose some of the potential benefits of > breaking > Bioperl up completely, whilst also suffering the maintenance > problems I > outlined in my objection to Steve's post. > > Being able to work on all Bioperl from a single cvs (ne svn) check > out/ > archive, whilst distributing it as individual modules on CPAN seems > like > the best of both worlds to me. What am I missing? 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From chris at bioteam.net Thu Jun 28 00:08:25 2007 From: chris at bioteam.net (Chris Dagdigian) Date: Thu, 28 Jun 2007 00:08:25 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> <18051.1253.87485.235496@almost.alerce.com> Message-ID: <97A3257B-8E00-48D7-8B7D-51AD728CB8F7@bioteam.net> My understanding of "https+svn" is that it is actually WebDAV-over- HTTP which means that not only would we need to light up a HTTPD server on the developer box we'd also have to get a stable mod_dav module installed (sometimes not trivial) and then we would have to figure out how to handle the authentication bits. Right now with SSH we use Unix group permissions to figure out who can write to what repository -- WebDAV makes this a lot more complicated. Forcing encryption over https will prevent someone from sniffing a developer password which removes the main security issue. The next problem is going to be integrating the DAV module with Linux PAM so that existing usernames and passwords can be used, -OR- we have to set up and maintain an entirely separate set of username and password maps for each developer and each SVN project. I'm not super concerned about this -- BioTeam runs svn internally and we expose our SVN for employees both via WebDAV and SVN+SSH - it's not that hard to set up. My biggest concern really has to do with how much extra work this will mean for the OBF sysadmin team. If there is an easy way to get a stable Apache/DAV/SVN integration going with authentication coming from Linux PAM then this is no big deal. 
If we have to manually maintain separate authentication lists then it will be kind of a hassle. Like Jason mentioned, the OBF currently segregates "stuff" onto three different servers with three levels of security: - dev.open-bio.org -- Developers only, SSH access only (main sourcecode repository for OBF) - portal.open-bio.org -- Websites, Wikis, Blogs, Mailing list servers and helpdesk.open-bio.org - code.open-bio.org -- "Disposable" anonymous access server that we can easily burn/wipe/reinstall if it ever gets hacked Everything else that Jason mentioned is fine and easy to set up (if not already running): - SVN+SSH for developers - Anonymous SVN and Anonymous RSYNC for community access on code.open-bio.org - svn2cvs for whomever wants it on code.open-bio.org - web based SVN code browser installed on http://code.open-bio.org Regards, Chris On Jun 27, 2007, at 11:29 PM, Jason Stajich wrote: > I think Chris D and I will need to confer a bit on https+svn. I > don't know when we'll have a good chance to discuss everything. At > some point this discussion is may need to be taken off bioperl and > just the interested parties as we're delving into hardware geek land. > > The repository machine (dev) is a locked down machine meaning it > only really runs ssh and not many servers include httpd. We have > anonymous CVS (client and through httpd browsing) running on a > separate machine (code) that has the info rsynced over every 10 or > 15 minutes. The foundation websites and mailing lists run on a > third machine (portal). > > > If we decide to support https we'll need to spend a little time > deciding how well we can keep it locked down - it will only be > https not http for example and we may want to see about limiting > ssh access to everyone if we migrate all OBF projects over to SVN > and only support https. > > Again to re-iterate what I think we would do: > - SVN read/write will live on 'dev', _WHEN_ we switch over no > writes to the CVS repository. 
It will be available by ssh+svn and > potentially by https+svn > - SVN read-only will live on 'code', it will be accessible by http > +svn > - CVS read-only will live on 'code', this will only be a sync from > the SVN to the CVS. See http://svn2cvs.tigris.org/ for details > > > As I tried to ask for in the past, would someone also illustrate > the importance of why _WE_ need to switch to SVN on a wiki page on > Bioperl so that when someone complains/asks about this in the > future the arguments are already laid out. I am basically fine > with it, but I don't honestly see a compelling reason beyond what > has been mentioned wrt better integration in IDEs. > http://bioperl.org/wiki/Why_SVN > > -jason > On Jun 27, 2007, at 9:46 PM, George Hartzell wrote: > >> Chris Fields writes: >>> [...] >>> Now how about a quick straw poll, what kind of access? svn+ssh is >>> already available, but some (Aaron among them) have indicated they >>> would like https as well (not sure how involved it would be to >>> set up). >> >> What we do here, in large part, depends on what our host machine >> makes >> available to us. >> >> Is there an apache instance that we can use? Maybe a separate one? >> >> May someone among us configure it, or do we need to ask for help? >> (in >> other words, does anyone have sudo?) >> >> Is there some reason to not include http: (using Digest >> authentication >> so that passwords aren't passed in the clear?)? Maybe even go so far >> as to ask why bother with https:, it's not like we need to transfer >> any data encrypted.... >> >> g. 
> > -- > Jason Stajich > jason at bioperl.org > http://jason.open-bio.org/ > > From cjfields at uiuc.edu Thu Jun 28 00:18:03 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 23:18:03 -0500 Subject: [Bioperl-l] Splits again In-Reply-To: <4682E824.1050507@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> Message-ID: On Jun 27, 2007, at 5:43 PM, Sendu Bala wrote: > Chris Fields wrote: > ... >> If a fix needed to be made in one set, make the fix, test against >> bioperl 'base' as a whole, and release when possible. No need to >> wait for a full-fledged 1.5.3 release. > > What advantage is there of these defined splits instead of > individual modules? As I see it you lose some of the potential > benefits of breaking Bioperl up completely, whilst also suffering > the maintenance problems I outlined in my objection to Steve's post. > > Being able to work on all Bioperl from a single cvs (ne svn) check > out/ archive, whilst distributing it as individual modules on CPAN > seems like the best of both worlds to me. What am I missing? Okay, forewarned, but here's my long-winded reasoning. The short and sweet version: I (very) respectfully don't agree with you, at least re: the idea we should commit all modules to CPAN independently. It doesn't make any sense to me, but maybe you can elaborate more? Maybe I'm misinterpreting what you mean? Also, I agree with Steve C. that core is anything but a representation of a 'core' set of modules, and some sections could (should?) be split off into discrete, cohesive units.
We may be alone in that camp, though it doesn't seem so (it's popped up more than a few times, in one form or another). If you want an in-depth explanation for both opinions, read on (below my sig), or feel free to bypass it. I'll understand. Finally, all of this should wait until later. Much later, like after a decent release, after svn, etc kind of 'later'. I think we can agree on that. . . . . . Still here? Okay... each issue (skip as needed): Individual CPAN modules: CPAN is not our personal versioning system; it may be if a distribution consists of only a few modules, but not when it's one of the largest distros present. If someone wants to update an individual bioperl module for a quick bug fix they are more than welcome to download it via cvs, svn, or even using a web browser, and replace the one they have. In most cases, it works w/o problems. With Module::Build you have even made it easier if a full installation is necessary. I'm trying to reason how one could break up the individual SeqIO/ SearchIO/otherIO modules into single module distributions. They are intrinsically tied together (SeqIO::genbank won't work w/o SeqIO, which relies on the various interfaces, RootIO, and on down). How would tests be run off CPAN when the modules are distributed independently? Would they also be individually distributed? What would you use to tie all the individual modules together? How would you explain to the CPAN maintainers that you want to split bioperl into 990 individual modules, all updated independently, but intend on bundling them afterwards anyway? I'm failing to see the advantages to this approach, but if you can find an example where this was done successfully on CPAN or elsewhere maybe I could see what you mean. Splitting up core: As I see it, here are the advantages of a defined split as Steve and I see it (off the top of my head). Some of this probably reiterates my previous points, as well as Steve's, so apologies in advance. 
- A lean, mean, focused set of bioperl base modules (core) w/o or with very few external deps, minimal installation issues, etc. The very basic stuff to get up and running. - BioPerl bundled modules (Nathan's 'cliques') with defined, focused functionality, code, and tests, which add a bit more 'sugar' to the base functionality of the core. If you only care about parsing BLAST reports, get SearchIO, which requires core and optionally other modules (XML::SAX). If you want additional DB functionality apart from the very basic ones in core, install DB (with it's additional requirements, including core, DBI, and so on). Same with Graphics, Tools, Tree/Phylo, etc. We just need to define and limit the number of splits. - Easier to add additional bundled modules. For instance, I could focus all of my RNA work into a discrete set of modules (say, bioperl- rna) which I maintain, I ensure works with the latest core code, I ensure also plays well with the other children =) , and I distribute via CPAN. Same with EUtilities, which could go into a separated DB- related set or stay in core. - If we want a full-fledged 'install everything', the CPAN Bundle system is available. I think it's easier to use a Bundle for 4-5, even 10 groups of modules as opposed to over 900. - A Bundle or a build file where discrete distributions are listed (Bio::SearchIO, etc) wouldn't need to be updated every time a new module is added to a distribution. I suppose this could be automated, but why have the additional headache? - A chance to cut out some cruft. We all know that particular areas need work or a complete overhaul (Restriction, Structure, maybe a few others). Smaller, concentrated sets of modules I believe would be easier to maintain, and those that don't get use will eventually fall out of favor and may be lost or replaced from the more maintained group of modules. Survival of the fittest. - We already have had practice; bioperl-db, bioperl-run, bioperl- network, and others. 
Those that have been routinely maintained and enjoy wide use (db, run, network) have survived; others not so much (corba-related stuff, microarray, ext, etc., though the code is still available if someone else wants to take it up and revive it!). Disadvantages of a defined split: - The initial headache of identifying which groups go where, coordinating with those who rely on bioperl (GMOD, etc) on how this will be set up, so on... - Separate groups of modules require testing together to ensure functionality is consistent and maintained (something I think you pointed out previously). - I think an increased possibility of branching is possible. - Extra headaches for devs, who have to keep track of the various critical distributions and make sure they work well together. - Maybe others, but it's getting late here. Add more as needed; I'm sure there are a number more. chris From cjfields at uiuc.edu Thu Jun 28 01:17:01 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 28 Jun 2007 00:17:01 -0500 Subject: [Bioperl-l] Splits again In-Reply-To: <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org> Message-ID: <671B8432-28DA-47DA-9E0C-66AF0E3D5973@uiuc.edu> D'oh! Just when I wanted to go to bed. It's not fair, you're in California... On Jun 27, 2007, at 10:51 PM, Jason Stajich wrote: > Hey guys - I'm wading in a bit late as I haven't had time to keep up > with whole discussion. > > So you are suggesting 800+ individual CPAN modules? I don't think > that is a good idea. Why would you split up Bio::Seq::RichSeq and > Bio::Seq into two separate packages for example? 
I think if you > really want to move away from the monolithic install it has to be > more logical by function - but I am not that optimistic that this is > going to actually be easier for people. Maybe I'm misunderstanding. Okay, so maybe it wasn't just me. > What are the arguments for separating things -- to make it so people > aren't scared by the number of modules so they'll code? It seems > like some people just want it to be installed and run scripts - does > having them install dozens of modules work. Do we need to consider > people how much this would suck if someone can't use CPAN or > Module::Builder to automate dependancy tracking installation? How > does it work when modules are deprecated? What I envision for core is maybe not just one distribution, but a cluster of distributions: base - Bio::Seq; Bio::SeqIO; Bio::AlignIO, some Bio::DB, associated modules. Bare bones, with as few dependencies as possible. aux - Any Bio::SeqIO, Bio::AlignIO, Bio::DB etc. that requires additional modules. search - Bio::Search and SearchIO tools - Bio::Tools, Bio::Restriction, maybe DB modules, GFF-related stuff? graphics - Bio::Graphics. Maybe GMOD-related stuff here? The last four would list bioperl-core as a dependency themselves along with any other modules necessary. We could also have the core Build.PL ask the user if they want to install the other non-base distros, and maybe include bioperl-db, bioperl-network, and bioperl- run in the loop if requested. All would be installed as a bundle similar to Bundle::BioPerl, but have regular CPAN point releases (1.x.x) independently from one another i.e. for bug fixes, with a yearly/biyearly timed full release (1.x) of the whole shebang. Any point release for any 'core' distribution would have to be tested against the others prior to release. 
This is basically following Steve's train of thought, though more elaborated: http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15315/ focus=15315 > I'm not sure I have made up my mind on what I'd like to see, but at > some point I think we need to get a clearer idea of what audience we > are trying to serve best. If want it to be easy to install maybe we > should invest time into making OSX double-click installers, RPMs, and > the Windows stuff easily installable. If we want to serve the > developers who aren't using SVN so we want to push out releases of > modules ASAP? I just am not clear on the motivation for some of the > proposed changes. I think regular CPAN releases with updated PPMs hosted via portal work fine for the most part, but it would be nice to host RPMs. Others (Allen Day, for instance) have donated time to generate RPMs but they seem to lag behind a bit more. The original idea for svn arose from an unrelated thread with Mark Johnson discussing something (Glimmer maybe?) and took off from there. I was actually pretty surprised it took on a life of it's own. As for the motivation to switch, I haven't specifically used it myself, but the large number of responses seem to indicate others have and seem happy with it. Rutger Vos had also indicated he would move Bio::Phylo over to the repo if we used svn. We def. should address the issues you bring up (why _WE_ need svn) more succinctly but that shouldn't be an issue. > Also - the main point I wanted to make - Can I suggest we spend a > little time discussing what it will take to get a stable release for > the current code as it stands (bioperl-live and bioperl-run)? It > seems like we really need to do this first so that we have a stable > release that can be followed by CVS -> SVN migration, then consider > major changes to the repository structure and release packaging, and > potential deprecation and incorporation of other modules. Agreed. We prob. 
need to schedule a good couple of days (or so) to squash bugs. > I assume there is no chance that we'd have a 1.6 candidate by BOSC > next month? Um, not likely as nothing has been addressed Feature/Annotation-wise (overloads are still there, methods have not been deprecated, etc). There was an underlying assumption these would have an effect on GMOD- related stuff (I remember reading a post from Scott Cain in the mail archive mentioning something along these lines after the 1.5 release hubbub). Maybe a quick 1.5.3 for BOSC, with a 1.6 for fall? > Will it be productive to schedule a fair amount of time at BOSC > discussing how to partition out the packages into separate sub- > packages after we've done a successful release rather than trying to > change things right now? I realize not everyone will be there but > maybe it will be easier to interact on this then. How many are going to be there? I can't go this year except on my own dime (which I don't have many of, student loans and all, sorry), though I'll likely be in a new lab by spring which is likely more amenable to funding. If there is a hackathon in the late fall (post- sept) I'll make it a point to go regardless. > I think it will also be time to talk with Lincoln/Scott about how > Gbrowse is structured and if that is working for them. There is too > much code in different places that I think we need to figure out how > to structure it properly so those packages can be released. It would > probably mean moving Bio::Graphics, Bio::DB::GFF and > Bio::DB::SeqFeature and gff tools for Gbrowse into separate packages > so they could be released more regularly on par with Gbrowse > schedules. Also I think someone needs to figure out Bio::Tools::GFF > vs Bio::FeatureIO -- what do we want to do? I don't think we really > fully support GFF3 that well -- the X2GFF scripts probably need some > more good testing (where X is BLAST,FASTA,Sim4, GenBank, EMBL, > etc... ) and or migration to the proper GFF writing. 
> > > -jason Will Lincoln or Scott be at BOSC? chris From dmessina at wustl.edu Thu Jun 28 01:21:58 2007 From: dmessina at wustl.edu (David Messina) Date: Thu, 28 Jun 2007 00:21:58 -0500 Subject: [Bioperl-l] finding statistics on AA In-Reply-To: <4681F4B4.8010609@pacific.net.sg> References: <4681F4B4.8010609@pacific.net.sg> Message-ID: Hi Melvin, I don't think BioPerl has any information content-related code. I'm not terribly familiar with it myself, but the usual recommendation is to look at the EMBOSS package: http://en.wikipedia.org/wiki/EMBOSS Dave From bix at sendu.me.uk Thu Jun 28 02:38:48 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 07:38:48 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org> Message-ID: <46835778.5070901@sendu.me.uk> Jason Stajich wrote: > So you are suggesting 800+ individual CPAN modules? > I don't think that is a good idea. Why would you split up > Bio::Seq::RichSeq and Bio::Seq into two separate packages for > example? I think if you really want to move away from the monolithic > install it has to be more logical by function - but I am not that > optimistic that this is going to actually be easier for people. > Maybe I'm misunderstanding. > > What are the arguments for separating things -- to make it so people > aren't scared by the number of modules so they'll code? It seems > like some people just want it to be installed and run scripts - does > having them install dozens of modules work.
Do we need to consider > people how much this would suck if someone can't use CPAN or > Module::Builder to automate dependancy tracking installation? How > does it work when modules are deprecated? See my upcoming reply to Chris. Briefly, if the only change is to the dist action of Build.PL, we can make a single archive of all modules available to non-CPAN users, and individual modules available to CPAN users. No problems. > Also - the main point I wanted to make - Can I suggest we spend a > little time discussing what it will take to get a stable release for > the current code as it stands (bioperl-live and bioperl-run)? It > seems like we really need to do this first so that we have a stable > release that can be followed by CVS -> SVN migration, then consider > major changes to the repository structure and release packaging, and > potential deprecation and incorporation of other modules. I'd recommend that a 'stable' release shouldn't happen until we resolve all the missing tests and bugzilla bugs (because I think the opportunity should be taken to have it stable both in terms of interface /and/ bugs). Which is a lot of work. > I assume there is no chance that we'd have a 1.6 candidate by BOSC > next month? None. From bix at sendu.me.uk Thu Jun 28 03:25:03 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 08:25:03 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> Message-ID: <4683624F.6020402@sendu.me.uk> Chris Fields wrote: > On Jun 27, 2007, at 5:43 PM, Sendu Bala wrote: >> What advantage is there of these defined splits instead of >> individual modules? 
As I see it you lose some of the potential >> benefits of breaking Bioperl up completely, whilst also suffering >> the maintenance problems I outlined in my objection to Steve's post. >> >> Being able to work on all Bioperl from a single cvs (ne svn) check >> out/ archive, whilst distributing it as individual modules on CPAN >> seems like the best of both worlds to me. What am I missing? > > Okay, forewarned, but here's my long-winded reasoning. The short and > sweet version: I (very) respectfully don't agree with you, at least > re: the idea we should commit all modules to CPAN independently. It > doesn't make any sense to me, but maybe you can elaborate more? > Maybe I'm misinterpreting what you mean? The short and sweet version: my proposal has all the benefits of yours, but none of the disadvantages. What's not to like? > Finally, all of this should wait until later. Much later, like after > a decent release, after svn, etc kind of 'later'. I think we can > agree on that. Hmm, not really. If it can be implemented by a change in just Build.PL and ModuleBuildBioperl, its really independent of everything else. That's the beauty of it: the only thing that changes is how things are uploaded to and downloaded from CPAN. The only person that normally deals with that issue is the pumpkin for a release, and he only cares about it at release time. In fact, if we're going to do it at all it makes sense to try it out on a minor release like 1.5.3. We've already got experience of doing it split-style from 1.5.2. (And let me tell you: splits at the code-base level suck.) > Individual CPAN modules: > > CPAN is not our personal versioning system; it may be if a > distribution consists of only a few modules, but not when it's one of > the largest distros present. If someone wants to update an > individual bioperl module for a quick bug fix they are more than > welcome to download it via cvs, svn, or even using a web browser, and > replace the one they have. 
And where is the harm in letting them do it via CPAN as well? In fact, there are significant benefits: > I'm trying to reason how one could break up the individual SeqIO/ > SearchIO/otherIO modules into single module distributions. They are > intrinsically tied together (SeqIO::genbank won't work w/o SeqIO, > which relies on the various interfaces, RootIO, and on down). How > would tests be run off CPAN when the modules are distributed > independently? Bio::SeqIO::genbank would have a dependency on the latest version of Bio::SeqIO (etc.), and Bio::SeqIO would have its own dependencies. So when a user wants to get the latest version of Bio::SeqIO::genbank, they no longer have to worry about what other modules in its dependency hierarchy they should also install. Instead they just request Bio::SeqIO::genbank which itself ensures you have the latest version of all its dependencies before installing itself and running its tests. When a dev makes a major bugfix to Bio::SeqIO::genbank that all genbank users should have, he could just call './Build dist Bio::SeqIO::genbank' which would generate a new package for Bio::SeqIO::genbank suitable for uploading to CPAN. No more long release cycles and having to constantly tell people to 'use CVS' to get working Bioperl code. > Would they also be individually distributed? What > would you use to tie all the individual modules together? How would > you explain to the CPAN maintainers that you want to split bioperl > into 990 individual modules, all updated independently, but intend on > bundling them afterwards anyway? They would be tied together by a CPAN bundle. You don't have to 'explain' anything to the CPAN maintainers because you're not doing anything wrong. In fact, you're using it the way you're supposed to. > Splitting up core: > > As I see it, here are the advantages of a defined split as Steve and > I see it (off the top of my head). 
Some of this probably reiterates > my previous points, as well as Steve's, so apologies in advance. Below I answer with how it would be with my single-module approach compared to the defined splits. > - A lean, mean, focused set of bioperl base modules (core) w/o or > with very few external deps, minimal installation issues, etc. The > very basic stuff to get up and running. Even leaner, even more focused. > - BioPerl bundled modules (Nathan's 'cliques') with defined, focused > functionality, code, and tests, which add a bit more 'sugar' to the > base functionality of the core. If you only care about parsing BLAST > reports, get SearchIO, which requires core and optionally other > modules (XML::SAX). If you want additional DB functionality apart > from the very basic ones in core, install DB (with it's additional > requirements, including core, DBI, and so on). Same with Graphics, > Tools, Tree/Phylo, etc. We just need to define and limit the number > of splits. The same can be achieved with CPAN bundles for each kind of functional grouping you can think of. And since its just a single text file that defines such a grouping, its easy to change or add new ones as you feel like it, as opposed to the rather more permanent and substantial effort of creating one of your splits on the code-base level. Also, the world doesn't have to rely on /our/ ideas of what a useful functional split is. If someone just wants to parse Blast results, they can just use CPAN to install Bio::SearchIO::blast_pull instead of having to install all of SearchIO. > - Easier to add additional bundled modules. For instance, I could > focus all of my RNA work into a discrete set of modules (say, bioperl- > rna) which I maintain, I ensure works with the latest core code, I > ensure also plays well with the other children =) , and I distribute > via CPAN. Same with EUtilities, which could go into a separated DB- > related set or stay in core. And if you lose interest in them? 
They eventually die because they no longer have someone looking after them by default (the pumpkin and other devs). Alternatively you could just make a CPAN bundle. One text file! Easy! No duplication of modules in CPAN, no new hassle for you or the Bioperl 'core' pumpkin to ensure that the latest versions of each work with each other and other splits. > - If we want a full-fledged 'install everything', the CPAN Bundle > system is available. I think it's easier to use a Bundle for 4-5, > even 10 groups of modules as opposed to over 900. No, it isn't any easier. It's /equally/ easy to install a bundle of 900 packages of 900 modules as it is to install 5 packages of 900 modules. When not installing absolutely everything, but perhaps 'most' things, there's the additional benefit that it would be easier to skip a particular Bio::module because you didn't want to install its external dependencies and weren't that interested in it anyway. > - A Bundle or a build file where discrete distributions are listed > (Bio::SearchIO, etc) wouldn't need to be updated every time a new > module is added to a distribution. I suppose this could be > automated, but why have the additional headache? Yes, it would be automated, and no, it wouldn't at all be any kind of additional headache. I'm proposing a fully-automated system that the pumpkin wouldn't even have to think about. Much /less/ of a headache than dealing with splits. Orders of magnitude easier to deal with. > - A chance to cut out some cruft. We all know that particular areas > need work or a complete overhaul (Restriction, Structure, maybe a few > others). Smaller, concentrated sets of modules I believe would be > easier to maintain, and those that don't get use will eventually fall > out of favor and may be lost or replaced from the more maintained > group of modules. Survival of the fittest. And the smallest, most concentrated set of modules is the individual module.
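For anyone who has not looked inside one, a CPAN bundle of the kind being discussed is roughly the file below. This is only a minimal sketch: the name Bundle::BioPerl::BlastParsing, its version number and the choice of modules are made up for illustration, and it assumes each listed Bio:: module were available on CPAN as its own distribution, which is the proposal here rather than the current state of things.

```perl
package Bundle::BioPerl::BlastParsing;   # hypothetical bundle name

use strict;
use warnings;

our $VERSION = '0.01';                   # illustrative version only

1;

__END__

=head1 NAME

Bundle::BioPerl::BlastParsing - illustrative bundle for parsing BLAST reports

=head1 SYNOPSIS

  perl -MCPAN -e 'install Bundle::BioPerl::BlastParsing'

=head1 CONTENTS

Bio::SearchIO

Bio::SearchIO::blast_pull

XML::SAX

=cut
```

Installing the bundle through the CPAN shell pulls in whatever is listed under CONTENTS, so changing a functional grouping would mean editing that list and nothing else.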
> - We already have had practice; bioperl-db, bioperl-run, bioperl- > network, and others. Those that have been routinely maintained and > enjoy wide use (db, run, network) have survived; others not so much > (corba-related stuff, microarray, ext, etc., though the code is still > available if someone else wants to take it up and revive it!). The reason some of these existing splits (microarray, ext) have fallen by the wayside? /Because/ they're splits. If they had been part of bioperl-live all along, they'd have been kept in a working, compatible state and would have been released along with everything else in 1.5.2. > Disadvantages of a defined split: > > - The initial headache of identifying which groups go where, > coordinating with those who rely on bioperl (GMOD, etc) on how this > will be set up, so on... No need to worry about this with individual modules. > - Separate groups of modules require testing together to ensure > functionality is consistent and maintained (something I think you > pointed out previously). No need to worry. > - I think an increased possibility of branching is possible. > > - Extra headaches for devs, who have to keep track of the various > critical distributions and make sure they work well together. No headaches. From charles-listes+bioperl at plessy.org Thu Jun 28 03:40:04 2007 From: charles-listes+bioperl at plessy.org (Charles Plessy) Date: Thu, 28 Jun 2007 16:40:04 +0900 Subject: [Bioperl-l] Is bioperl-live the nickname of the CVS head ? Message-ID: <20070628074004.GD6338@kunpuu.plessy.org> Dear developers, I am considering bringing bioperl 1.5.2 into Debian, and I am wondering if it would make sense to call it "bioperl-live" and distribute it in parallel with the stable 1.4.0 version, if bioperl-live means "the current developer version". If I am wrong, can somebody explain to me what bioperl-live exactly refers to?
Have a nice day, -- Charles Plessy Debian-med packaging team Wako, Saitama, Japan From n.haigh at sheffield.ac.uk Thu Jun 28 04:23:10 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Thu, 28 Jun 2007 09:23:10 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: <4683624F.6020402@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: <46836FEE.5030203@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Sendu Bala wrote: > Chris Fields wrote: >> On Jun 27, 2007, at 5:43 PM, Sendu Bala wrote: >>> What advantage is there of these defined splits instead of >>> individual modules? As I see it you lose some of the potential >>> benefits of breaking Bioperl up completely, whilst also suffering >>> the maintenance problems I outlined in my objection to Steve's post. >>> >>> Being able to work on all Bioperl from a single cvs (ne svn) check >>> out/ archive, whilst distributing it as individual modules on CPAN >>> seems like the best of both worlds to me. What am I missing? >> >> Okay, forewarned, but here's my long-winded reasoning. The short and >> sweet version: I (very) respectfully don't agree with you, at least >> re: the idea we should commit all modules to CPAN independently. It >> doesn't make any sense to me, but maybe you can elaborate more? >> Maybe I'm misinterpreting what you mean? > > The short and sweet version: my proposal has all the benefits of yours, > but none of the disadvantages. What's not to like? > > >> Finally, all of this should wait until later. Much later, like after >> a decent release, after svn, etc kind of 'later'. I think we can >> agree on that. > > Hmm, not really. 
If it can be implemented by a change in just Build.PL > and ModuleBuildBioperl, its really independent of everything else. > That's the beauty of it: the only thing that changes is how things are > uploaded to and downloaded from CPAN. The only person that normally > deals with that issue is the pumpkin for a release, and he only cares > about it at release time. > > In fact, if we're going to do it at all it makes sense to try it out on > a minor release like 1.5.3. We've already got experience of doing it > split-style from 1.5.2. (And let me tell you: splits at the code-base > level suck.) > > >> Individual CPAN modules: >> >> CPAN is not our personal versioning system; it may be if a >> distribution consists of only a few modules, but not when it's one of >> the largest distros present. If someone wants to update an >> individual bioperl module for a quick bug fix they are more than >> welcome to download it via cvs, svn, or even using a web browser, and >> replace the one they have. > > And where is the harm in letting them do it via CPAN as well? In fact, > there are significant benefits: > > >> I'm trying to reason how one could break up the individual SeqIO/ >> SearchIO/otherIO modules into single module distributions. They are >> intrinsically tied together (SeqIO::genbank won't work w/o SeqIO, >> which relies on the various interfaces, RootIO, and on down). How >> would tests be run off CPAN when the modules are distributed >> independently? > > Bio::SeqIO::genbank would have a dependency on the latest version of > Bio::SeqIO (etc.), and Bio::SeqIO would have its own dependencies. > > So when a user wants to get the latest version of Bio::SeqIO::genbank, > they no longer have to worry about what other modules in its dependency > hierarchy they should also install. > > Instead they just request Bio::SeqIO::genbank which itself ensures you > have the latest version of all its dependencies before installing itself > and running its tests. 
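As a rough illustration of the behaviour described above (each module pulls in its own dependencies first, and nothing is tested or installed twice), here is a minimal Perl sketch. It is not how CPAN or Module::Build work internally; the %requires table and the install() helper are hypothetical, and the dependency lists are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical dependency table, for illustration only: these are not
# the real prerequisite lists of the modules named.
my %requires = (
    'Bio::SeqIO::genbank' => ['Bio::SeqIO'],
    'Bio::SeqIO::fasta'   => ['Bio::SeqIO'],
    'Bio::SeqIO'          => ['Bio::Root::IO'],
    'Bio::Root::IO'       => [],
);

my %installed;    # modules already handled in this session
my @log;          # order in which modules were "tested and installed"

# Install a module after its dependencies; skip anything already done.
# No cycle detection: the sketch assumes the table has no circular deps.
sub install {
    my ($module) = @_;
    return if $installed{$module};
    install($_) for @{ $requires{$module} || [] };
    push @log, $module;    # stands in for "run its tests, then install"
    $installed{$module} = 1;
    return;
}

install('Bio::SeqIO::genbank');
install('Bio::SeqIO::fasta');

print "$_\n" for @log;
```

Run as is, it lists Bio::Root::IO and Bio::SeqIO once each, ahead of the two format modules that asked for them, so the shared Bio::SeqIO step is not repeated when fasta is installed after genbank.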
This was my thinking when I first brought this up at the beginning/splitting of this thread. Thinking of modules as the constituent parts of a larger package should make it far easier for people to define dependencies, and users would only need to install those parts they require. As Sendu points out, if the user wants to convert seqs from genbank to fasta they could simply install Bio::SeqIO::genbank and Bio::SeqIO::fasta and they would get all the other modules that are the dependencies of Bio::SeqIO::genbank and Bio::SeqIO::fasta. > > When a dev makes a major bugfix to Bio::SeqIO::genbank that all genbank > users should have, he could just call './Build dist Bio::SeqIO::genbank' > which would generate a new package for Bio::SeqIO::genbank suitable for > uploading to CPAN. No more long release cycles and having to constantly > tell people to 'use CVS' to get working Bioperl code. However, how would the test suite work out with this? e.g. when someone installs Bio::SeqIO::genbank they want to have the tests associated with Bio::SeqIO::genbank to be run. Would there be tests that would be run redundantly if for example someone installed Bio::SeqIO::genbank and Bio::SeqIO::fasta? > > >> Would they also be individually distributed? What would you use to >> tie all the individual modules together? How would you explain to >> the CPAN maintainers that you want to split bioperl into 990 >> individual modules, all updated independently, but intend on bundling >> them afterwards anyway? > > They would be tied together by a CPAN bundle. You don't have to > 'explain' anything to the CPAN maintainers because you're not doing > anything wrong. In fact, you're using it the way you're supposed to. Yep. Real modules are released as modules, each with their own set of dependencies. Then use CPAN bundles the way they were supposed to be used: for distributing a set of CPAN modules that make a coherent set of functionality.
You "could" also bundle in other authors modules e.g. Bio::ASN1::EntrezGene? > > >> Splitting up core: >> >> As I see it, here are the advantages of a defined split as Steve and >> I see it (off the top of my head). Some of this probably reiterates >> my previous points, as well as Steve's, so apologies in advance. > > Below I answer with how it would be with my single-module approach > compared to the defined splits. > > >> - A lean, mean, focused set of bioperl base modules (core) w/o or >> with very few external deps, minimal installation issues, etc. The >> very basic stuff to get up and running. > > Even leaner, even more focused. > > >> - BioPerl bundled modules (Nathan's 'cliques') with defined, focused >> functionality, code, and tests, which add a bit more 'sugar' to the >> base functionality of the core. If you only care about parsing BLAST >> reports, get SearchIO, which requires core and optionally other >> modules (XML::SAX). If you want additional DB functionality apart >> from the very basic ones in core, install DB (with it's additional >> requirements, including core, DBI, and so on). Same with Graphics, >> Tools, Tree/Phylo, etc. We just need to define and limit the number >> of splits. > > The same can be achieved with CPAN bundles for each kind of functional > grouping you can think of. And since its just a single text file that > defines such a grouping, its easy to change or add new ones as you feel > like it, as opposed to the rather more permanent and substantial effort > of creating one of your splits on the code-base level. > > Also, the world doesn't have to rely on /our/ ideas of what a useful > functional split is. If someone just wants to parse Blast results, they > can just use CPAN to install Bio::SearchIO::blast_pull instead of having > to install all of SearchIO. > > >> - Easier to add additional bundled modules. 
For instance, I could >> focus all of my RNA work into a discrete set of modules (say, bioperl- >> rna) which I maintain, I ensure works with the latest core code, I >> ensure also plays well with the other children =) , and I distribute >> via CPAN. Same with EUtilities, which could go into a separated DB- >> related set or stay in core. > > And if you lose interest in them? They eventually die because they no > longer have someone looking after them by default (the pumpkin and other > devs). Alternatively you could just make a CPAN bundle. One text file! > Easy! No duplication of modules in CPAN, no new hassle for you or the > Bioperl 'core' pumpkin to ensure that the latest version of each work > with each other and other splits. Hmm, how would module versions be handled? Wouldn't this approach require each module to have its own independent version number, which could then be used for building the dependencies? Each new release of that module would only bump that module's version number. Bundles can specify the minimum version of a module to be installed, such that bug fixes to individual modules can be released into CPAN and would automatically get picked up when installing bundles etc. I'm not quite sure how the current stable/dev releases would work. I assume bug fixes would have to be made on a branch e.g. branch 1.6 and released to cpan from there. Then when the next stable release is made, all module versions would be bumped and released to CPAN, with any modifications to the content of the bundle made at the same time. Is it possible to have stable and developer release bundles that are able to specify the minimum stable and developer module versions respectively? > > >> - If we want a full-fledged 'install everything', the CPAN Bundle >> system is available. I think it's easier to use a Bundle for 4-5, >> even 10 groups of modules as opposed to over 900. > > No, it isn't any easier.
Its /equally/ easy to install a bundle of 900 > packages of 900 modules as it is to install 5 packages of 900 modules. > > When not installing absolutely everything, but perhaps 'most' things, > there's the additional benefit that it would be easier to skip a > particular Bio::module because you didn't want to install its external > dependencies and weren't that interested in it anyway. > > >> - A Bundle or a build file where discrete distributions are listed >> (Bio::SearchIO, etc) wouldn't need to be updated every time a new >> module is added to a distribution. I suppose this could be >> automated, but why have the additional headache? > > Yes, it would be automated, and no, it wouldn't at all be any kind of > additional headache. I'm proposing a fully-automated system that the > pumpkin wouldn't even have to think about it. Much /less/ of a headache > than dealing with splits. Orders of magnitude easier to deal with. > > >> - A chance to cut out some cruft. We all know that particular areas >> need work or a complete overhaul (Restriction, Structure, maybe a few >> others). Smaller, concentrated sets of modules I believe would be >> easier to maintain, and those that don't get use will eventually fall >> out of favor and may be lost or replaced from the more maintained >> group of modules. Survival of the fittest. > > And the smallest, most concentrated set of modules is the individual > module. > > >> - We already have had practice; bioperl-db, bioperl-run, bioperl- >> network, and others. Those that have been routinely maintained and >> enjoy wide use (db, run, network) have survived; others not so much >> (corba-related stuff, microarray, ext, etc., though the code is still >> available if someone else wants to take it up and revive it!). > > The reason some of these existing splits (micoarray, ext) have fallen by > the way-side? /Because/ they're splits. 
If they had been part of > bioperl-live all along, they'd have been kept in a working, compatible > state and would have been released along with everything else in 1.5.2 > > >> Disadvantages of a defined split: >> >> - The initial headache of identifying which groups go where, >> coordinating with those who rely on bioperl (GMOD, etc) on how this >> will be set up, so on... > > No need to worry about this with individual modules. > > >> - Separate groups of modules require testing together to ensure >> functionality is consistent and maintained (something I think you >> pointed out previously). > > No need to worry. Maybe need to worry about how the tests are run when installing individual modules, etc.? > > >> - I think an increased possibility of branching is possible. >> >> - Extra headaches for devs, who have to keep track of the various >> critical distributions and make sure they work well together. > > No headaches. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGg2/uczuW2jkwy2gRAlR4AJ44kHIXWWapNVGOIrkFBJdP9rn3vwCdErhT VkymyXNshguE44/RilEXWDA= =O5ex -----END PGP SIGNATURE----- From n.haigh at sheffield.ac.uk Thu Jun 28 04:27:54 2007 From: n.haigh at sheffield.ac.uk (Nathan S.
Haigh) Date: Thu, 28 Jun 2007 09:27:54 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: <4683624F.6020402@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: <4683710A.9010808@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Sendu Bala wrote: > Chris Fields wrote: >> On Jun 27, 2007, at 5:43 PM, Sendu Bala wrote: >>> What advantage is there of these defined splits instead of >>> individual modules? As I see it you lose some of the potential >>> benefits of breaking Bioperl up completely, whilst also suffering >>> the maintenance problems I outlined in my objection to Steve's post. >>> >>> Being able to work on all Bioperl from a single cvs (ne svn) check >>> out/ archive, whilst distributing it as individual modules on CPAN >>> seems like the best of both worlds to me. What am I missing? >> >> Okay, forewarned, but here's my long-winded reasoning. The short and >> sweet version: I (very) respectfully don't agree with you, at least >> re: the idea we should commit all modules to CPAN independently. It >> doesn't make any sense to me, but maybe you can elaborate more? >> Maybe I'm misinterpreting what you mean? > > The short and sweet version: my proposal has all the benefits of yours, > but none of the disadvantages. What's not to like? > > >> Finally, all of this should wait until later. Much later, like after >> a decent release, after svn, etc kind of 'later'. I think we can >> agree on that. > > Hmm, not really. If it can be implemented by a change in just Build.PL > and ModuleBuildBioperl, its really independent of everything else. 
> That's the beauty of it: the only thing that changes is how things are > uploaded to and downloaded from CPAN. The only person that normally > deals with that issue is the pumpkin for a release, and he only cares > about it at release time. > > In fact, if we're going to do it at all it makes sense to try it out on > a minor release like 1.5.3. We've already got experience of doing it > split-style from 1.5.2. (And let me tell you: splits at the code-base > level suck.) > > >> Individual CPAN modules: >> >> CPAN is not our personal versioning system; it may be if a >> distribution consists of only a few modules, but not when it's one of >> the largest distros present. If someone wants to update an >> individual bioperl module for a quick bug fix they are more than >> welcome to download it via cvs, svn, or even using a web browser, and >> replace the one they have. > > And where is the harm in letting them do it via CPAN as well? In fact, > there are significant benefits: > > >> I'm trying to reason how one could break up the individual SeqIO/ >> SearchIO/otherIO modules into single module distributions. They are >> intrinsically tied together (SeqIO::genbank won't work w/o SeqIO, >> which relies on the various interfaces, RootIO, and on down). How >> would tests be run off CPAN when the modules are distributed >> independently? > > Bio::SeqIO::genbank would have a dependency on the latest version of > Bio::SeqIO (etc.), and Bio::SeqIO would have its own dependencies. > > So when a user wants to get the latest version of Bio::SeqIO::genbank, > they no longer have to worry about what other modules in its dependency > hierarchy they should also install. > > Instead they just request Bio::SeqIO::genbank which itself ensures you > have the latest version of all its dependencies before installing itself > and running its tests. 
> > When a dev makes a major bugfix to Bio::SeqIO::genbank that all genbank > users should have, he could just call './Build dist Bio::SeqIO::genbank' > which would generate a new package for Bio::SeqIO::genbank suitable for > uploading to CPAN. No more long release cycles and having to constantly > tell people to 'use CVS' to get working Bioperl code. > > >> Would they also be individually distributed? What would you use to >> tie all the individual modules together? How would you explain to >> the CPAN maintainers that you want to split bioperl into 990 >> individual modules, all updated independently, but intend on bundling >> them afterwards anyway? > > They would be tied together by a CPAN bundle. You don't have to > 'explain' anything to the CPAN maintainers because you're not doing > anything wrong. In fact, you're using it the way you're supposed to. > The successor to Bundles - may prove interesting: http://search.cpan.org/~adamk/Task-1.01/lib/Task.pm > >> Splitting up core: >> >> As I see it, here are the advantages of a defined split as Steve and >> I see it (off the top of my head). Some of this probably reiterates >> my previous points, as well as Steve's, so apologies in advance. > > Below I answer with how it would be with my single-module approach > compared to the defined splits. > > >> - A lean, mean, focused set of bioperl base modules (core) w/o or >> with very few external deps, minimal installation issues, etc. The >> very basic stuff to get up and running. > > Even leaner, even more focused. > > >> - BioPerl bundled modules (Nathan's 'cliques') with defined, focused >> functionality, code, and tests, which add a bit more 'sugar' to the >> base functionality of the core. If you only care about parsing BLAST >> reports, get SearchIO, which requires core and optionally other >> modules (XML::SAX). 
If you want additional DB functionality apart >> from the very basic ones in core, install DB (with it's additional >> requirements, including core, DBI, and so on). Same with Graphics, >> Tools, Tree/Phylo, etc. We just need to define and limit the number >> of splits. > > The same can be achieved with CPAN bundles for each kind of functional > grouping you can think of. And since its just a single text file that > defines such a grouping, its easy to change or add new ones as you feel > like it, as opposed to the rather more permanent and substantial effort > of creating one of your splits on the code-base level. > > Also, the world doesn't have to rely on /our/ ideas of what a useful > functional split is. If someone just wants to parse Blast results, they > can just use CPAN to install Bio::SearchIO::blast_pull instead of having > to install all of SearchIO. > > >> - Easier to add additional bundled modules. For instance, I could >> focus all of my RNA work into a discrete set of modules (say, bioperl- >> rna) which I maintain, I ensure works with the latest core code, I >> ensure also plays well with the other children =) , and I distribute >> via CPAN. Same with EUtilities, which could go into a separated DB- >> related set or stay in core. > > And if you lose interest in them? They eventually die because they no > longer have someone looking after them by default (the pumpkin and other > devs). Alternatively you could just make a CPAN bundle. One text file! > Easy! No duplication of modules in CPAN, no new hassle for you or the > Bioperl 'core' pumpkin to ensure that the latest version of each work > with each other and other splits. > > >> - If we want a full-fledged 'install everything', the CPAN Bundle >> system is available. I think it's easier to use a Bundle for 4-5, >> even 10 groups of modules as opposed to over 900. > > No, it isn't any easier. 
Its /equally/ easy to install a bundle of 900 > packages of 900 modules as it is to install 5 packages of 900 modules. > > When not installing absolutely everything, but perhaps 'most' things, > there's the additional benefit that it would be easier to skip a > particular Bio::module because you didn't want to install its external > dependencies and weren't that interested in it anyway. > > >> - A Bundle or a build file where discrete distributions are listed >> (Bio::SearchIO, etc) wouldn't need to be updated every time a new >> module is added to a distribution. I suppose this could be >> automated, but why have the additional headache? > > Yes, it would be automated, and no, it wouldn't at all be any kind of > additional headache. I'm proposing a fully-automated system that the > pumpkin wouldn't even have to think about it. Much /less/ of a headache > than dealing with splits. Orders of magnitude easier to deal with. > > >> - A chance to cut out some cruft. We all know that particular areas >> need work or a complete overhaul (Restriction, Structure, maybe a few >> others). Smaller, concentrated sets of modules I believe would be >> easier to maintain, and those that don't get use will eventually fall >> out of favor and may be lost or replaced from the more maintained >> group of modules. Survival of the fittest. > > And the smallest, most concentrated set of modules is the individual > module. > > >> - We already have had practice; bioperl-db, bioperl-run, bioperl- >> network, and others. Those that have been routinely maintained and >> enjoy wide use (db, run, network) have survived; others not so much >> (corba-related stuff, microarray, ext, etc., though the code is still >> available if someone else wants to take it up and revive it!). > > The reason some of these existing splits (micoarray, ext) have fallen by > the way-side? /Because/ they're splits. 
If they had been part of > bioperl-live all along, they'd have been kept in a working, compatible > state and would have been released along with everything else in 1.5.2 > > >> Disadvantages of a defined split: >> >> - The initial headache of identifying which groups go where, >> coordinating with those who rely on bioperl (GMOD, etc) on how this >> will be set up, so on... > > No need to worry about this with individual modules. > > >> - Separate groups of modules require testing together to ensure >> functionality is consistent and maintained (something I think you >> pointed out previously). > > No need to worry. > > >> - I think an increased possibility of branching is possible. >> >> - Extra headaches for devs, who have to keep track of the various >> critical distributions and make sure they work well together. > > No headaches. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGg3EKczuW2jkwy2gRAriiAJ47Qz9jTshEXuaG0XMYrUTI0hHqAwCeL45r r/BykCKbM9lqJM0khARuEms= =NB4B -----END PGP SIGNATURE----- From n.haigh at sheffield.ac.uk Thu Jun 28 04:51:19 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Thu, 28 Jun 2007 09:51:19 +0100 Subject: [Bioperl-l] Is bioperl-live the nickname of the CVS head ? In-Reply-To: <20070628074004.GD6338@kunpuu.plessy.org> References: <20070628074004.GD6338@kunpuu.plessy.org> Message-ID: <46837687.7010101@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Charles Plessy wrote: > Dear developpers, > > I am considering to bring bioperl 1.5.2 in Debian, and I am wondering if > it would make sense to call it "bioperl-live" and distribute it in > parallel with the stable 1.4.0 version, if bioperl-live means "the > current developepr version". > > If I am wrong, can somebody explain me what bioperl-live exactly refers > to ? 
> > Have a nice day, > bioperl-live really means the HEAD of the cvs repository, so it is the most bleeding-edge code available. Version 1.5.* is the developer release, while the 1.4.* is the stable release. However, there have been few updates to the 1.4.* release which means that it is more unstable than the 1.5.* dev release. I think the consensus was to have more rapid release cycles of the stable branch in future in order to avoid this. I'm sure there are others more qualified to expand/correct me on this if needs be. Nath -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGg3aHczuW2jkwy2gRAo5pAJ95BGqrA5bLwRKNfUQi/HfBnkUJjwCg0mYB /fHFyYkqAvcmOSxu4djPll0= =KwVH -----END PGP SIGNATURE----- From bix at sendu.me.uk Thu Jun 28 05:11:39 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 10:11:39 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: <46836FEE.5030203@sheffield.ac.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> <46836FEE.5030203@sheffield.ac.uk> Message-ID: <46837B4B.7060705@sendu.me.uk> Nathan S. Haigh wrote: (Please try and snip more: don't quote whole posts just to reply to certain paragraphs) > Sendu Bala wrote: >> Chris Fields wrote: >> When a dev makes a major bugfix to Bio::SeqIO::genbank that all genbank >> users should have, he could just call './Build dist Bio::SeqIO::genbank' >> which would generate a new package for Bio::SeqIO::genbank suitable for >> uploading to CPAN. No more long release cycles and having to constantly >> tell people to 'use CVS' to get working Bioperl code. > > However, how would the test suite work out with this? e.g.
when someone > installs Bio::SeqIO::genbank they want to have the tests associated with > Bio::SeqIO::genbank to be run. Would there be tests that would be run > redundantly if for example someone installed Bio::SeqIO::genbank and > Bio::SeqIO::fasta? We would want to move to a strict test-script-per-module system. But that's desirable in any case, as it would greatly ease reaching our goal of complete test coverage, and subsequent maintenance of those tests. The genbank test would only run tests specific to genbank parsing, and likewise for fasta. They would both have a dependency on Bio::SeqIO, and if that was also recently updated, it would get installed prior to you installing genbank (and therefore run its own generic SeqIO tests), but wouldn't get installed again (wouldn't run its tests again) when you install fasta afterwards. On the subject of tests, I'm reminded of another benefit of the individual-module approach. Currently if a test fails during a CPAN install, nothing gets installed. Users do one of: # refuse to install at all (strict sys-admins) # cry and give up (newbies) # cry and seek help (newbies who really really need Bioperl) # force install, leaving them in some undefined state because they didn't understand the problems (most remaining users) # force install, happy that the problems are ok (some Bioperl devs) With a bundle of individual modules you would install virtually all Bioperl modules with no problems, and the problems with the remainder would be clear to everyone. No one would need to force install since the test results would now be meaningful: the thing you're trying to install really isn't going to work if the tests are failing. If you really needed that particular Bioperl module you could then pay particular attention to why it's failing (most likely some problem with an external dependency). >>> Would they also be individually distributed? What would you use to >>> tie all the individual modules together?
>> >> They would be tied together by a CPAN bundle. You don't have to >> 'explain' anything to the CPAN maintainers because you're not doing >> anything wrong. In fact, you're using it the way you're supposed to. > > Yep. real modules are released as modules, each with their own set of > dependencies. The use CPAN bundles the way there were supposed to be for > - - distributing a set of CPAN modules that make a coherent set of > functionality. You "could" also bundle in other authors modules e.g. > Bio::ASN1::EntrezGene? Any bundle featuring Bio::SeqIO::entrezgene would necessarily include Bio::ASN1::EntrezGene in the bundle. > Hmm, how would module versions be handled? Wouldn't this approach > require each module to have it's own independent version number, which > could then be used for building the dependencies? Each new release of > that module would only bump that module's version number. Yes, that's how it would work. No more global version number. > Bundles can specify the minimum version of a module to be installed, > such that bug fixes to individual modules and be released into CPAN and > would automatically get picked up when installing bundles etc. Yes. > I'm not quite sure how the current stable/dev releases would work. I > assume bug fixes would have to be made on a branch e.g. branch 1.6 and > released to cpan from there. Then when the next stable release is made, > all module versions would be bumped and and released to CPAN. With any > modifications to the content of the bundle to be made. Is it possible to > have a stable and developer release bundles that are able to specify the > minimum stable and developer modules versions respectively? No, the distinction becomes pretty meaningless. We could still do big major releases, but modules wouldn't be version-bumped. The big release would just be an update of the bundle that specifies the latest version of all Bioperl modules. 
Remember that bundles only specify the minimum version, not the required version: in this brave new world users would end up with the same versions of modules if they installed a 1.8 bundle compared to 1.7 bundle. The only way to get a true snapshot of 1.7 after it was released would be if we took snapshots and archived them, making them available from bioperl.org (or by checking out the 1.7 tag from cvs/svn). I don't see that as a significant problem. You lose the trivial benefit of being able to install old snapshots from CPAN. The people who have a great need to install old snapshots can find their way to bioperl.org no problem. From bix at sendu.me.uk Thu Jun 28 04:50:09 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 09:50:09 +0100 Subject: [Bioperl-l] Is bioperl-live the nickname of the CVS head ? In-Reply-To: <20070628074004.GD6338@kunpuu.plessy.org> References: <20070628074004.GD6338@kunpuu.plessy.org> Message-ID: <46837641.8050106@sendu.me.uk> Charles Plessy wrote: > I am considering to bring bioperl 1.5.2 in Debian, and I am wondering if > it would make sense to call it "bioperl-live" and distribute it in > parallel with the stable 1.4.0 version, if bioperl-live means "the > current developepr version". > > If I am wrong, can somebody explain me what bioperl-live exactly refers > to ? bioperl-live is the name of the CVS repository containing what is currently considered the 'Core package' or core modules. http://www.bioperl.org/wiki/Using_CVS If you want to call it something to distinguish it from stable, call it 'developer' vs 'stable' or '1.5.2' vs '1.4.0'. To distinguish them both from the other packages, call them 'core' vs 'run' etc. 
From hlapp at gmx.net Thu Jun 28 06:31:29 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 28 Jun 2007 07:31:29 -0300 Subject: [Bioperl-l] Splits again In-Reply-To: <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org> Message-ID: <23C3729D-6E13-445F-99AF-54B4DC0BB4E8@gmx.net> On Jun 28, 2007, at 12:51 AM, Jason Stajich wrote: > [...] Also - the main point I wanted to make - Can I suggest we > spend a > little time discussing what it will take to get a stable release for > the current code as it stands (bioperl-live and bioperl-run)? It > seems like we really need to do this first so that we have a stable > release that can be followed by CVS -> SVN migration, then consider > major changes to the repository structure and release packaging, and > potential deprecation and incorporation of other modules. I agree we need to discuss a path towards 1.6, but I think that should be kept separate from the cvs->svn migration. Otherwise one stalls the other (by stopping people who seem to have the energy and motivation right now to do one but not the other) for no really good reason. > I assume there is no chance that we'd have a 1.6 candidate by BOSC > next month? I'm not sure that's feasible, but if someone steps up maybe it is. > > Will it be productive to schedule a fair amount of time at BOSC > discussing how to partition out the packages into separate sub- > packages after we've done a successful release rather than trying to > change things right now? I agree. I also don't think that people are partitioning right now (other than the existing partitioning), though maybe I'm mistaken.
> [...] > It would probably mean moving Bio::Graphics, Bio::DB::GFF and > Bio::DB::SeqFeature and gff tools for Gbrowse into separate packages > so they could be released more regularly on par with Gbrowse > schedules. Possibly. I'm not fully sure why those modules couldn't also be released more often out of the "main trunk" of modules. In Java/ant, it'd be relatively easy to write build script filters that select the appropriate modules and package them on the fly. I'm not sure whether the build tools for Perl can do that too, though. > Also I think someone needs to figure out Bio::Tools::GFF > vs Bio::FeatureIO -- what do we want to do? I believe FeatureIO has the ontology download tied into it? -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Thu Jun 28 06:47:39 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 28 Jun 2007 07:47:39 -0300 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> <18051.1253.87485.235496@almost.alerce.com> Message-ID: On Jun 28, 2007, at 12:29 AM, Jason Stajich wrote: > As I tried to ask for in the past, would someone also illustrate the > importance of why _WE_ need to switch to SVN on a wiki page on > Bioperl so that when someone complains/asks about this in the future > the arguments are already laid out. I am basically fine with it, but > I don't honestly see a compelling reason beyond what has been > mentioned wrt better integration in IDEs. > http://bioperl.org/wiki/Why_SVN I guess at the end of the day svn is just the system of choice for new developers. I've had people tell me who started with svn that cvs seems a lot harder to use. 
The newer projects are all on svn, and integrating Bio::Phylo into BioPerl, for example, shouldn't become a question of the revision control system. At the end of the day, if being on svn makes it easier for new people to contribute, that's enough of an argument for me, whether it's rational or not.

IMHO, there's two advantages that svn has over cvs. First, directories are versioned, have properties, and generally are the same class of citizens as files. They can be added, renamed, and removed from the repository. In cvs, we all know what a hassle it is to rename or even retire directories. Second, svn log gives you the commits, i.e., the set of changes that constituted one particular commit (and therefore version increase). In cvs that's hard or impossible to reconstruct.

Bottom line - I don't think many people if any will question why we moved from cvs to svn ...

My $0.02 ...

-hilmar
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From hartzell at alerce.com Wed Jun 27 20:34:37 2007
From: hartzell at alerce.com (George Hartzell)
Date: Wed, 27 Jun 2007 20:34:37 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy]
In-Reply-To: <1CB82DD7-A997-47E8-A6A5-2AC9B375E875@uiuc.edu>
References: <4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu> <9FEF16FD-47F6-4D7B-A1E6-1A92CC82570C@gmx.net> <1CB82DD7-A997-47E8-A6A5-2AC9B375E875@uiuc.edu>
Message-ID: <18051.541.684705.567954@almost.alerce.com>

Chris Fields writes:
> We should port them all, yes.
>
> chris
>
> On Jun 27, 2007, at 4:35 PM, Hilmar Lapp wrote:
>
> > Is there a reason not to port every subproject over?
> >
> > -hilmar

They're all there. At least everything that I found in the CVS repo. Some of the directories were empty, some had very little content; I was just mechanical about it.
Here's what I have: [hartzell at dev ~]$ svn ls file://`pwd`/bioperl biodata/ bioperl-cookbook/ bioperl-corba-client/ bioperl-corba-server/ bioperl-das-client/ bioperl-db/ bioperl-ext/ bioperl-gui/ bioperl-live/ bioperl-microarray/ bioperl-network/ bioperl-papers/ bioperl-pedigree/ bioperl-pipeline/ bioperl-run/ biosql-schema/ html/ task-manager/ xml-html/ I wasn't very clear in my original request, but I was hoping that someone out there who's familiar with the various out-of-the-way bits and pieces could take a look at them. I was afraid that everyone was just checking out bioperl-live and doing 'make test'. Someone (chris?) made a point about binary files in bioperl-run. It'd be great if someone in the know could check on them. Also, to the degree that it's possible, look around at various tags and branches and see if they're what you'd expect. Thanks! g. From bix at sendu.me.uk Thu Jun 28 08:21:37 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 13:21:37 +0100 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <18049.30026.61328.134490@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> Message-ID: <4683A7D1.8070403@sendu.me.uk> George Hartzell wrote: > Chris Fields writes: > > [...] > > It looks like George Hartzell may be taking a crack at it, with > > Rutger Vos, Nathan Haigh, and moi helping out where needed. If so we > > could have something testable relatively soon. After that we'll need > > to work out a few other issues, basically what's on Hilmar's list. > > There's a repository on file:///home/hartzell/bioperl with all of the > components projects in place. 
> > If you have a dev.open-bio.org account and you're in the bioperl > group, you're good to get at it via: > > file:///home/hartzell/bioperl I'm confused. Presumably that only works whilst logged into dev.open-bio.org? > svn+ssh://dev.open-bio.org/home/hartzell/bioperl I just tried: svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl on Mac OS X and things seemed to go well, except for this error message at the end: svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data' svn: Can't move source to dest svn: Can't move 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' to 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': No such file or directory I also ended up with only: bioperl-corba-server bioperl-db bioperl-live bioperl-network bioperl-papers biosql-schema Am I doing something totally wrong here? From hartzell at alerce.com Thu Jun 28 08:32:36 2007 From: hartzell at alerce.com (George Hartzell) Date: Thu, 28 Jun 2007 08:32:36 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> <18051.1253.87485.235496@almost.alerce.com> Message-ID: <18051.43620.481558.447399@almost.alerce.com> Jason Stajich writes: > [...] > The repository machine (dev) is a locked down machine meaning it only > really runs ssh and not many servers include httpd. We have > anonymous CVS (client and through httpd browsing) running on a > separate machine (code) that has the info rsynced over every 10 or 15 > minutes. A great way to provide a read-only mirror of the repos. for anonymous users is to have svnsync running out of cron on code.open-bio.org, configured to pull from the dev.open-bio.org repository. 
It might actually work to have rsync mirror the fsfs-backed repository, but that's scary-poking-into-the-internals.

g.

From hartzell at alerce.com Thu Jun 28 08:43:37 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 08:43:37 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy]
In-Reply-To:
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
Message-ID: <18051.44281.831316.749586@almost.alerce.com>

David Messina writes:
>
> On Jun 27, 2007, at 2:49 PM, Hilmar Lapp wrote:
>
> >
> > On Jun 27, 2007, at 1:27 PM, David Messina wrote:
> >
> >> I would think we would want "Author Date Id Rev URL" set on
> >> everything, no?. So either cvs2svn or your tool (whichever you think
> >> is better), followed by
> >>
> >> svn propset svn:keywords "Author Date Id Rev URL" *
> >
> > Shouldn't this be done recursively?
>
>
> Yep, good catch! Thanks, Hilmar.
>
> Should be:
>
> svn propset --recursive svn:keywords "Author Date Id Rev URL" *

That's not quite what you want either. It'll set the keyword property on all of the files, including things where you probably don't want expansion to happen (e.g. images, someone said there are binary wads in bioperl-run, etc...).

The Right Thing To Do is to grub around (grep) for '\$Id:' (and the others) and set svn:keywords only on files that are already using keywords. I have a Bourne shell hack that'll do this, although it's painful because it has to run in working directories....

Once we settle on a list of keywords to use, I'll take a whack at the demo repository.
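
This is not the Bourne shell hack mentioned above, just a rough sketch of the same idea under a few assumptions: it runs against a checked-out working copy, the function names `files_with_keywords` and `set_keywords` are made up for illustration, and the keyword list is the "Author Date Id Rev URL" set floated in this thread. Extend the pattern if other keyword names turn out to be in use.

```shell
#!/bin/sh
# Sketch only: find files that already carry an RCS-style keyword
# (e.g. $Id$ or $Id: Foo.pm,v 1.2 ... $) and set svn:keywords on
# just those files, instead of on every file in the tree.

# Print the files under directory $1 that contain one of the keywords.
# The .svn administrative directories are skipped.
files_with_keywords() {
    find "$1" -name .svn -prune -o -type f \
        -exec grep -l -E '\$(Author|Date|Id|Rev|URL)(:[^$]*)?\$' {} +
}

# Set svn:keywords on each file reported by files_with_keywords.
# Must be run inside an svn working copy (hypothetical helper).
set_keywords() {
    files_with_keywords "$1" | while IFS= read -r f; do
        svn propset svn:keywords "Author Date Id Rev URL" "$f"
    done
}
```

Running `files_with_keywords .` first and eyeballing the list before calling `set_keywords .` seems prudent; the property changes stay local until they are committed.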
Likewise, you probably DON'T want to use this in your config file:

enable-auto-props = yes
* = svn:keywords="Author Date Id Rev URL"

since it'll do the same thing. The Right Thing To Do is a more tedious

*.pl = svn:keywords="Author Date Id Rev URL"
*.pm = svn:keywords="Author Date Id Rev URL"
*.c = svn:keywords="Author Date Id Rev URL"

A bit of googling will give you a good starting point for the list, and we should probably maintain a common one somewhere in the repo.

I don't think that there's a server side way of doing this, short of running some script via a hook around commit time.

g.

From hartzell at alerce.com Thu Jun 28 08:54:40 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 08:54:40 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy]
In-Reply-To:
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> <18051.1253.87485.235496@almost.alerce.com>
Message-ID: <18051.44944.982207.37624@almost.alerce.com>

Hilmar Lapp writes:
> [...]
> IMHO, there's two advantages that svn has over cvs. First,
> directories are versioned, have properties, and generally are the
> same class of citizens as files. They can be added, renamed, and
> removed from the repository. In cvs, we all know what a hassle it is
> to rename or even retire directories. Second, svn log gives you the
> commits, i.e., the set of changes that constituted one particular
> commit (and therefore version increase). In cvs that's hard or
> impossible to reconstruct.

Two more:

- svn groups changes into revisions, so that they can be considered together; CVS versions individual files.
- subversion tracks renames/moves correctly,
- subversion commits are atomic, so you never have to worry about all of your stuff making it into the repos. at the same time [if you've never had to un-muck this, count yourself blessed!],
- svk, which allows disconnected development while still committing your work to a repo at natural points along the way (you can revert, branch, etc.... to your heart's content).

[yeah, that's 3, err, 4. Math is hard.]

g.

From cjfields at uiuc.edu Thu Jun 28 09:07:24 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 08:07:24 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <23C3729D-6E13-445F-99AF-54B4DC0BB4E8@gmx.net>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org> <23C3729D-6E13-445F-99AF-54B4DC0BB4E8@gmx.net>
Message-ID: <01812F01-9409-49FB-9061-330FA52177C1@uiuc.edu>

On Jun 28, 2007, at 5:31 AM, Hilmar Lapp wrote:

>
> On Jun 28, 2007, at 12:51 AM, Jason Stajich wrote:
>
>> ...It
>> seems like we really need to do this first so that we have a stable
>> release that can be followed by CVS -> SVN migration, then consider
>> major changes to the repository structure and release packaging, and
>> potential deprecation and incorporation of other modules.
>
> I agree we need to discuss a path towards 1.6, but I think that
> should be kept separate from the cvs->svn migration. Otherwise one
> stalls the other (by stopping people who seem to have the energy and
> motivation right now to do one but not the other) for no really good
> reason.

It's good to discuss it as long as it doesn't take time and energy away from other priorities.

>> I assume there is no chance that we'd have a 1.6 candidate by BOSC
>> next month?
>
> I'm not sure that's feasible to be happening but if someone steps up
> it maybe it is.

Maybe a 1.5.3 and (if we work hard on it) a 1.6 soon after.
Then maybe work on partitioning if everyone's up for it and a scheme is worked out. >> Will it be productive to schedule a fair amount of time at BOSC >> discussing how to partition out the packages into separate sub- >> packages after we've done a successful release rather than trying to >> change things right now? > > I agree. I also don't think that people are partitioning right now > (other than the existing partitioning), though maybe I'm mistaken. The original proposal was based on Steve's idea of splitting up core. I don't think a partition is feasible at this point, at least until we put more thought into it (our energy should be focused elsewhere), but it's well worth discussing as a future path. At this time there are two proposals: 1) Steve's and my 'split into discrete sections' proposal, where we split core into self-sustaining sections with a common core listed as a dependency, tying installation of all together with a Bundle or similar. 2) Sendu's 'break everything up' approach where all modules are submitted independently to CPAN, with their own tests, dependencies, etc. There are advantages and disadvantages to both approaches. Not sure if CPAN would go for the latter (it's pretty drastic), but I don't know for sure. If you want in on that discussion (in this thread) feel free to join in! The more the merrier! >> [...] >> It would probably mean moving Bio::Graphics, Bio::DB::GFF and >> Bio::DB::SeqFeature and gff tools for Gbrowse into separate packages >> so they could be released more regularly on par with Gbrowse >> schedules. > > Possibly. I'm not fully sure why those modules couldn't also be > released more often out of the "main trunk" of modules. In Java/ant, > it'd be relatively easy to write build script filters that select the > appropriate modules and package them on the fly. I'm not sure whether > the build tools for Perl can do that too, though. 
Both approaches above would probably use Module::Build to install other bioperl dependencies, each of which could have it's own dependency set, possibly using a Bundle to tie everything together. >> Also I think someone needs to figure out Bio::Tools::GFF >> vs Bio::FeatureIO -- what do we want to do? > > I believe FeatureIO has the ontology download tied into it? > > -hilmar From recent posts here and on the gbrowse mail list by Scott and Lincoln, it seemed like they were moving away from using Bio::DB::GFF and were trying to get users to switch to Bio::DB::SeqFeature. Maybe should get a more direct response? chris From hartzell at alerce.com Thu Jun 28 09:16:18 2007 From: hartzell at alerce.com (George Hartzell) Date: Thu, 28 Jun 2007 09:16:18 -0400 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <4683A7D1.8070403@sendu.me.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> Message-ID: <18051.46242.942184.758493@almost.alerce.com> Sendu Bala writes: > George Hartzell wrote: > > Chris Fields writes: > > > [...] > > > It looks like George Hartzell may be taking a crack at it, with > > > Rutger Vos, Nathan Haigh, and moi helping out where needed. If so we > > > could have something testable relatively soon. After that we'll need > > > to work out a few other issues, basically what's on Hilmar's list. > > > > There's a repository on file:///home/hartzell/bioperl with all of the > > components projects in place. > > > > If you have a dev.open-bio.org account and you're in the bioperl > > group, you're good to get at it via: > > > > file:///home/hartzell/bioperl > > I'm confused. Presumably that only works whilst logged into > dev.open-bio.org? 
Yes, that only works if you're actually on the machine. > > svn+ssh://dev.open-bio.org/home/hartzell/bioperl > > I just tried: > > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl > > on Mac OS X and things seemed to go well, except for this error message > at the end: > > > svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data' > svn: Can't move source to dest > svn: Can't move > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' > to > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': > No such file or directory > > I also ended up with only: > bioperl-corba-server bioperl-db bioperl-live > bioperl-network bioperl-papers biosql-schema > > > Am I doing something totally wrong here? It looks like you tried to check out the *entire* repository. It never occured to me to try that. I'll take a look at what you reported. g. From bix at sendu.me.uk Thu Jun 28 09:20:19 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 14:20:19 +0100 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <18051.46242.942184.758493@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.46242.942184.758493@almost.alerce.com> Message-ID: <4683B593.3050108@sendu.me.uk> George Hartzell wrote: > Sendu Bala writes: >> I just tried: >> >> svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl [snip] > It looks like you tried to check out the *entire* repository. Yes. If you don't want everything, how does one 'browse' the repository to find out the address of the thing you /do/ want? > It never occured to me to try that. I'll take a look at what you > reported. Cheers. 
From bix at sendu.me.uk Thu Jun 28 09:27:29 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 14:27:29 +0100 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <18049.22260.967524.353173@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.22260.967524.353173@almost.alerce.com> Message-ID: <4683B741.5020600@sendu.me.uk> George Hartzell wrote: > There don't seem to be any .cvsignore files in the repository, or in > CVSROOT/cvsignore. > > Am I missing something, or don't we use them? It would be great to have the following files svn:ignored : In all package roots: ? Build ? MANIFEST ? MANIFEST.SKIP ? META.yml ? _build ? bioperl-*.tar.bz2 ? bioperl-*.tar.gz ? bioperl-*.zip ? blib ? cover_db In any and all directories: ? .DS_Store ? .DAV In bioperl-live: ? t/BioDBSeqFeature.t ? t/BioDBSeqFeature_BDB.t ? t/BioDBSeqFeature_mysql.t Can't think of anything else right now. Thanks for your efforts, Sendu. From cjfields at uiuc.edu Thu Jun 28 09:30:43 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 28 Jun 2007 08:30:43 -0500 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <4683A7D1.8070403@sendu.me.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> Message-ID: On Jun 28, 2007, at 7:21 AM, Sendu Bala wrote: >> ... >> file:///home/hartzell/bioperl > > I'm confused. Presumably that only works whilst logged into > dev.open-bio.org? Yes, it's just a tester. 
>> svn+ssh://dev.open-bio.org/home/hartzell/bioperl > > I just tried: > > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl Try 'svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- live/trunk /mybiodir' to check out the main trunk for core. chris From hartzell at alerce.com Thu Jun 28 09:57:00 2007 From: hartzell at alerce.com (George Hartzell) Date: Thu, 28 Jun 2007 09:57:00 -0400 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <4683A7D1.8070403@sendu.me.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> Message-ID: <18051.48684.996884.134046@almost.alerce.com> Sendu Bala writes: > [...] > I just tried: > > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl > > on Mac OS X and things seemed to go well, except for this error message > at the end: > > > svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data' > svn: Can't move source to dest > svn: Can't move > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' > to > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': > No such file or directory > > I also ended up with only: > bioperl-corba-server bioperl-db bioperl-live > bioperl-network bioperl-papers biosql-schema > > > Am I doing something totally wrong here? So, you probably wanted something like svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk to pick up the head of the bioperl live tree (or /.../bioperl-run/trunk, etc...). 
I just checked out

svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/

and it ran to completion and gave me

(delicious)[6:50am]~/tmp>>ls bioperl | cat
biodata
bioperl-cookbook
bioperl-corba-client
bioperl-corba-server
bioperl-das-client
bioperl-db
bioperl-ext
bioperl-gui
bioperl-live
bioperl-microarray
bioperl-network
bioperl-papers
bioperl-pedigree
bioperl-pipeline
bioperl-run
biosql-schema
html
task-manager
xml-html

Can another Mac OS X user out there give the Great Big Checkout a try and see if it runs to completion? Potential problems that come to mind are:

- the "Macs are case insensitive, sort of" problem
- you filled up your disk
- something else.

g.

From charles-listes+bioperl at plessy.org Thu Jun 28 09:44:56 2007
From: charles-listes+bioperl at plessy.org (Charles Plessy)
Date: Thu, 28 Jun 2007 22:44:56 +0900
Subject: [Bioperl-l] Is bioperl-live the nickname of the CVS head ?
In-Reply-To: <46837687.7010101@sheffield.ac.uk>
References: <20070628074004.GD6338@kunpuu.plessy.org> <46837687.7010101@sheffield.ac.uk>
Message-ID: <20070628134456.GB14492@kunpuu.plessy.org>

On Thu, Jun 28, 2007 at 09:51:19AM +0100, Nathan S. Haigh wrote:
>
> Version 1.5.* is the developer release, while the 1.4.* is the stable
> release. However, there have been few updates to the 1.4.* release which
> means that it is more unstable than the 1.5.* dev release. I think the
> consensus was to have more rapid release cycles of the stable branch in
> future in order to avoid this. I'm sure there are others more qualified
> to expand/correct me on this if need be.

Ok, thank you all for the answers. I think that I will simply upgrade bioperl to 1.5.2 in Debian testing, and maybe rename it bioperl-core when I package other components.
Have a nice day, -- Charles Plessy Debian-Med packaging team Wako, Saitama, Japan From bix at sendu.me.uk Thu Jun 28 10:19:49 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 15:19:49 +0100 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <18051.48684.996884.134046@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> Message-ID: <4683C385.3050904@sendu.me.uk> George Hartzell wrote: > Sendu Bala writes: > > [...] > > I just tried: > > > > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl > > > > on Mac OS X and things seemed to go well, except for this error message > > at the end: > > > > > > svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data' > > svn: Can't move source to dest > > svn: Can't move > > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' > > to > > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': > > No such file or directory > > > > I also ended up with only: > > bioperl-corba-server bioperl-db bioperl-live > > bioperl-network bioperl-papers biosql-schema I tried again in the same location and it told me I had to 'svn cleanup', which I did. But subsequently it kept complaining about files already being there. > I just checked out > > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/ > > and it ran to completion [snip] > Can another mac os x user out there give the Great Big Checkout a try > and see if it runs to completion. Potential problems that come to > mind are: > > - the "mac's are case insensitive, sort of" problem > - you filled up your disk > - something else. 
Well, I didn't run out of disc space. After a rm -fr * and trying again it failed at exactly the same point, in the same way. svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data causes this repeatable problem: [...] A data/phredfile.phd svn: In directory 'data' svn: Can't move source to dest svn: Can't move 'data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' to 'data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': No such file or directory That is with Mac OS X svn command-line client, version 1.4.4 I can get bioperl-live/tags/release-0-9-2/t/data to check out fine with a linux svn command-line client, version 1.2.3. Cheers, Sendu. From dmessina at wustl.edu Thu Jun 28 11:08:59 2007 From: dmessina at wustl.edu (David Messina) Date: Thu, 28 Jun 2007 10:08:59 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <18051.44281.831316.749586@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> Message-ID: > [George] > Likewise, you probably DON'T want to use this in your config file: > > enable-auto-props = yes > * = svn:keywords="Author Date Id Rev URL" > > since it'll do the same thing. Ah, so I've been doing it wrong all along then. :) Thanks, George! > The Right Thing To Do is a more tedious > > *.pl = svn:keywords="Author Date Id Rev URL" > *.pm = svn:keywords="Author Date Id Rev URL" > *.c = svn:keywords="Author Date Id Rev URL" > > A bit of googling will give you a good starting point for the list, > and we should probably maintain a common one somewhere in the repo. 
I've googled around and gathered the following as a possible list for our repo. Since I obviously don't know what I'm doing :), of course adjust and refine as necessary.

Dave

-------

[auto-props]

# Code formats
*.c = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.cpp = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.h = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.java = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.as = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.cgi = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.js = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/javascript
*.php = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-php
*.pl = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-perl; svn:executable
*.pm = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-perl
*.py = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-python; svn:executable
*.sh = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-sh; svn:executable

# Image formats
*.bmp = svn:mime-type=image/bmp
*.gif = svn:mime-type=image/gif
*.ico = svn:mime-type=image/ico
*.jpeg = svn:mime-type=image/jpeg
*.jpg = svn:mime-type=image/jpeg
*.png = svn:mime-type=image/png
*.tif = svn:mime-type=image/tiff
*.tiff = svn:mime-type=image/tiff

# Data formats
*.pdf = svn:mime-type=application/pdf
*.avi = svn:mime-type=video/avi
*.doc = svn:mime-type=application/msword
*.eps = svn:mime-type=application/postscript
*.gz = svn:mime-type=application/gzip
*.mov = svn:mime-type=video/quicktime
*.mp3 = svn:mime-type=audio/mpeg
*.ppt = svn:mime-type=application/vnd.ms-powerpoint
*.ps = svn:mime-type=application/postscript
*.psd = svn:mime-type=application/photoshop
*.rtf = svn:mime-type=text/rtf
*.swf = svn:mime-type=application/x-shockwave-flash
*.tgz = svn:mime-type=application/gzip
*.wav = svn:mime-type=audio/wav
*.xls = svn:mime-type=application/vnd.ms-excel
*.zip = svn:mime-type=application/zip

# Text formats
.htaccess = svn:mime-type=text/plain
*.css = svn:mime-type=text/css
*.dtd = svn:mime-type=text/xml
*.html = svn:mime-type=text/html
*.ini = svn:mime-type=text/plain
*.sql = svn:mime-type=text/x-sql
*.txt = svn:mime-type=text/plain
*.xhtml = svn:mime-type=text/xhtml+xml
*.xml = svn:mime-type=text/xml
*.xsd = svn:mime-type=text/xml
*.xsl = svn:mime-type=text/xml
*.xslt = svn:mime-type=text/xml
*.xul = svn:mime-type=text/xul
*.yml = svn:mime-type=text/plain
CHANGES = svn:mime-type=text/plain
COPYING = svn:mime-type=text/plain
INSTALL = svn:mime-type=text/plain
Makefile* = svn:mime-type=text/plain
README = svn:mime-type=text/plain
TODO = svn:mime-type=text/plain

From dmessina at wustl.edu Thu Jun 28 11:11:23 2007
From: dmessina at wustl.edu (David Messina)
Date: Thu, 28 Jun 2007 10:11:23 -0500
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683B593.3050108@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.46242.942184.758493@almost.alerce.com> <4683B593.3050108@sendu.me.uk>
Message-ID:

> [Sendu]
>
> Yes. If you don't want everything, how does one 'browse' the
> repository
> to find out the address of the thing you /do/ want?
svn ls file://dev.open-bio.org/home/hartzell/bioperl or svn ls svn+ssh://dev.open-bio.org/home/hartzell/bioperl From n.haigh at sheffield.ac.uk Thu Jun 28 11:13:58 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Thu, 28 Jun 2007 16:13:58 +0100 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <4683B593.3050108@sendu.me.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.46242.942184.758493@almost.alerce.com> <4683B593.3050108@sendu.me.uk> Message-ID: <4683D036.5060109@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Sendu Bala wrote: > George Hartzell wrote: >> Sendu Bala writes: >>> I just tried: >>> >>> svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl > [snip] >> It looks like you tried to check out the *entire* repository. > > Yes. If you don't want everything, how does one 'browse' the repository > to find out the address of the thing you /do/ want? > You could try: svn ls or svn ls -R to get a list of directories. > >> It never occured to me to try that. I'll take a look at what you >> reported. > > Cheers. 
> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGg9A2czuW2jkwy2gRAgirAKCnMAg6a7W7RM22O2rOi4vD5w3HPwCePsku akLhIszoQbRc/aVX3d/Jp7w= =mlHY -----END PGP SIGNATURE----- From cjfields at uiuc.edu Thu Jun 28 11:20:46 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 28 Jun 2007 10:20:46 -0500 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <4683C385.3050904@sendu.me.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> <4683C385.3050904@sendu.me.uk> Message-ID: <6830CBB0-70A1-4F84-9C52-F61AEDC4B83B@uiuc.edu> I can replicate the same problem (Mac OS X) with a full checkout: svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data' svn: Can't move source to dest svn: Can't move 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/ tmp/prop-base/HUMBETGLOA.FASTA.svn-base' to 'bioperl/bioperl-live/ tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': No such file or directory What local (mac) svn version are you using? I'm running off macports: svn --version svn, version 1.4.4 (r25188) compiled Jun 16 2007, 23:40:53 chris On Jun 28, 2007, at 9:19 AM, Sendu Bala wrote: ... > I tried again in the same location and it told me I had to 'svn > cleanup', which I did. But subsequently it kept complaining about > files > already being there. >> > [snip] >> Can another mac os x user out there give the Great Big Checkout a try >> and see if it runs to completion. 
Potential problems that come to >> mind are: >> >> - the "mac's are case insensitive, sort of" problem >> - you filled up your disk >> - something else. > > Well, I didn't run out of disc space. After a rm -fr * and trying > again > it failed at exactly the same point, in the same way. > > svn co > svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/ > release-0-9-2/t/data > > causes this repeatable problem: > > [...] > A data/phredfile.phd > svn: In directory 'data' > svn: Can't move source to dest > svn: Can't move 'data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' to > 'data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': No such file or > directory > > That is with Mac OS X svn command-line client, version 1.4.4 > > I can get bioperl-live/tags/release-0-9-2/t/data to check out fine > with > a linux svn command-line client, version 1.2.3. > > > Cheers, > Sendu. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Thu Jun 28 11:37:27 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 28 Jun 2007 10:37:27 -0500 Subject: [Bioperl-l] Splits again In-Reply-To: <4683624F.6020402@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: On Jun 28, 2007, at 2:25 AM, Sendu Bala wrote: > Chris Fields wrote: >> ... > > The short and sweet version: my proposal has all the benefits of > yours, but none of the disadvantages. What's not to like? 
The short and sweet version: I'm more convinced after you laid out your argument in detail, which would have saved me some typing last night, BTW, thanks! ; > The other core devs need to chip in and we need to openly (candidly) discuss it some more (I've added Hilmar to this). There is also a tenable solution that allows both aspects ('cliques' and single mode) which might make everybody happy. Let's say we only want to install Bio::SeqIO::genbank. The Bio::SeqIO::genbank Build.PL would only install what was needed (as you indicated), only Bio::SeqIO::genbank-related tests would run (along with dependency test, if available), and life would go on. However, what if we wanted to install everything in SeqIO/DB/AlignIO/ etc? We could have the Bio::SeqIO Build.PL ask whether you want all SeqIO modules installed or a select few (maybe a quick 'install all (y/n)?' followed by a list, which installs them one at a time along with dependencies), or have the option to specifically denote them as passed args to SeqIO's Build.PL, something like 'perl Build.PL - install-plugins genbank embl swiss', 'perl Build.PL -install-plugins all', etc. If a specific module (Bio::SeqIO::genbank) is installed directly then maybe the installation q&a's of followed modules could be bypassed when installing down the dependency tree with additional passed args. This would, in effect, be a bioperl-specific mini-CPAN within CPAN. Nice! Now, this doesn't address several related issues, such as how we handle versioning of the independent modules (should be in a controlled manner), what we do about deprecated modules which linger about on CPAN, how we deal with PPMs/RPMs/packaging, and so on. All have possible reasonable ways they can be addressed, I believe. Also, I think we should still think about doing regular full-scale 'stable' (1.#) releases (sort of our stamp of approval for that batch of modules at that point in time, with a reasonable 'sell-by' date). 
Again, it should be seriously discussed among the core devs and the bioperl community at large prior to any serious work on it, and it would be quite a large-scale project, but possibly worth it. It can only go forward if there is enough momentum behind it. >> Finally, all of this should wait until later. Much later, like >> after a decent release, after svn, etc kind of 'later'. I think >> we can agree on that. > > Hmm, not really. If it can be implemented by a change in just > Build.PL and ModuleBuildBioperl, its really independent of > everything else. That's the beauty of it: the only thing that > changes is how things are uploaded to and downloaded from CPAN. The > only person that normally deals with that issue is the pumpkin for > a release, and he only cares about it at release time. > > In fact, if we're going to do it at all it makes sense to try it > out on a minor release like 1.5.3. We've already got experience of > doing it split-style from 1.5.2. (And let me tell you: splits at > the code-base level suck.) BOSC is coming up, and I would like to focus on getting svn migration taken care of ASAP (which is sounding more and more like we plan on moving all open-bio over, unless I misread Jason's post?) and stomping of bugs (my next priority after EUtilities). Maybe in the interim we should try focusing on bug squashing, get out a quick standard dev release (1.5.3) before BOSC, and then a few of us could all communicate there via email/text/IM/phone off-list? Maybe post updates via the bioperl blog and list? > And where is the harm in letting them do it via CPAN as well? In > fact, there are significant benefits: ... I'm already pretty convinced... > The same can be achieved with CPAN bundles for each kind of > functional grouping you can think of. 
And since its just a single > text file that defines such a grouping, its easy to change or add > new ones as you feel like it, as opposed to the rather more > permanent and substantial effort of creating one of your splits on > the code-base level. ... or it could be run right in Module::Build for specific parent classes (as I mention above). Bundling could be instituted for something like a standard GBrowse release (Bundle::BioPerl::GBrowse) where the functionality might be more spread out (Bio::DB*, Bio::Graphics, Bio::FeatureIO, etc). For a full-scale old-style core install, another Bundle (Bundle::BioPerl::Standard). ... > Yes, it would be automated, and no, it wouldn't at all be any kind > of additional headache. I'm proposing a fully-automated system that > the pumpkin wouldn't even have to think about it. Much /less/ of a > headache than dealing with splits. Orders of magnitude easier to > deal with. The 'headache' would be the initial setup (splitting test, individual Build.PL, etc), but this could be done stepwise or section-wise, I suppose. ... > And the smallest, most concentrated set of modules is the > individual module. Well, only if it runs correctly (i.e. has the entire dep. tree installed). But the 'follow' tests would handle that. > The reason some of these existing splits (micoarray, ext) have > fallen by the way-side? /Because/ they're splits. If they had been > part of bioperl-live all along, they'd have been kept in a working, > compatible state and would have been released along with everything > else in 1.5.2 microarray fell out of favor for other reasons (much faster ways to do the same thing via R), though I think it still could be salvaged if someone wanted to take it up. the other bioperl distros (network, db, run, etc) would also necessitate following the same path as core, but I guess they could be bundled as well. > ... > No headaches. I already have one, sorry! 
chris From n.haigh at sheffield.ac.uk Thu Jun 28 11:53:52 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Thu, 28 Jun 2007 16:53:52 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: <4683D990.8090909@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Chris Fields wrote: > On Jun 28, 2007, at 2:25 AM, Sendu Bala wrote: > >> Chris Fields wrote: >>> ... >> >> The short and sweet version: my proposal has all the benefits of >> yours, but none of the disadvantages. What's not to like? > > The short and sweet version: I'm more convinced after you laid out your > argument in detail, which would have saved me some typing last night, > BTW, thanks! ; > > > The other core devs need to chip in and we need to openly (candidly) > discuss it some more (I've added Hilmar to this). There is also a > tenable solution that allows both aspects ('cliques' and single mode) > which might make everybody happy. Couldn't "cliques" simply be satisfied with CPAN Bundles? > > Let's say we only want to install Bio::SeqIO::genbank. The > Bio::SeqIO::genbank Build.PL would only install what was needed (as you > indicated), only Bio::SeqIO::genbank-related tests would run (along with > dependency test, if available), and life would go on. However, what if > we wanted to install everything in SeqIO/DB/AlignIO/etc? I think this might be where Bundles come in for installing these "cliques" of related modules? - -- snip -- > >> Yes, it would be automated, and no, it wouldn't at all be any kind of >> additional headache. I'm proposing a fully-automated system that the >> pumpkin wouldn't even have to think about it. 
Much /less/ of a >> headache than dealing with splits. Orders of magnitude easier to deal >> with. > > The 'headache' would be the initial setup (splitting test, individual > Build.PL, etc), but this could be done stepwise or section-wise, I suppose. Yes, I think this is where most of the labour will be. However, setting the test suite up like this would be beneficial with or without publishing modules individually. - -- snip -- -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGg9mQczuW2jkwy2gRAlfBAKCFP7XUvWXsjycSv0MVGN3Ru40D/wCcDiDg UKE/Q/wA3gu1Gb7S6rarCQw= =WQdY -----END PGP SIGNATURE----- From bix at sendu.me.uk Thu Jun 28 12:03:54 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 17:03:54 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: <4683DBEA.90005@sendu.me.uk> Chris Fields wrote: > On Jun 28, 2007, at 2:25 AM, Sendu Bala wrote: > Let's say we only want to install Bio::SeqIO::genbank. The > Bio::SeqIO::genbank Build.PL would only install what was needed (as you > indicated), only Bio::SeqIO::genbank-related tests would run (along with > dependency test, if available), and life would go on. However, what if > we wanted to install everything in SeqIO/DB/AlignIO/etc? > > We could have the Bio::SeqIO Build.PL ask whether you want all SeqIO > modules installed or a select few (maybe a quick 'install all (y/n)?' 
> followed by a list, which installs them one at a time along with > dependencies), or have the option to specifically denote them as passed > args to SeqIO's Build.PL, something like 'perl Build.PL -install-plugins > genbank embl swiss', 'perl Build.PL -install-plugins all', etc. If a > specific module (Bio::SeqIO::genbank) is installed directly then maybe > the installation q&a's of followed modules could be bypassed when > installing down the dependency tree with additional passed args. I'd probably stay away from something like this. My primary reason being, off-the-top-of-my-head I don't see how to get it to work. If you're installing Bio::SeqIO for the first time via CPAN you can't ask it to install Bio::SeqIO::genbank et al. at the same time because Bio::SeqIO::genbank depends on Bio::SeqIO, giving us some circularity. I also wouldn't want these things to be complicated. There should be little in the way of questions to ask during install. Each module's Build.PL should be ultra-simple with no advanced logic at all. It should just specify things that are absolute requirements. This simplicity helps avoid some of the problems we face by distributing the monolithic Bioperl. No, much better for us and for users to provide a Bundle::Bio-SeqIO. > Now, this doesn't address several related issues, such as how we handle > versioning of the independent modules (should be in a controlled > manner), When a module is changed, it gets a version bump. Nothing complicated needs to be done. Transparent and obvious, behaving like all other CPAN modules would be my choice. > what we do about deprecated modules which linger about on CPAN, Delete them from CPAN seems appropriate. > how we deal with PPMs/RPMs/packaging, and so on. All have possible > reasonable ways they can be addressed, I believe. 
Also, I think we > should still think about doing regular full-scale 'stable' (1.#) > releases (sort of our stamp of approval for that batch of modules at > that point in time, with a reasonable 'sell-by' date). Yes, we can still choose to take a snapshot and announce it to the world, but at the module-level nothing special would happen. There would just be an updated Bundle::Bioperl-everything (or whatever). > Again, it should be seriously discussed among the core devs and the > bioperl community at large prior to any serious work on it, and it would > be quite a large-scale project, but possibly worth it. It can only go > forward if there is enough momentum behind it. The requirement for this approach is per-module test scripts. Which as I identified already, is very desirable anyway so we can hit 100% test coverage. So, regardless of anything else can we all agree that per-module test scripts are a good idea and should be worked on? If so, I'll look into the feasibility and figure out how much work will be involved. From cjfields at uiuc.edu Thu Jun 28 13:17:50 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 28 Jun 2007 12:17:50 -0500 Subject: [Bioperl-l] Splits again In-Reply-To: <4683DBEA.90005@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> <4683DBEA.90005@sendu.me.uk> Message-ID: <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu> On Jun 28, 2007, at 11:03 AM, Sendu Bala wrote: > ... > I'd probably stay away from something like this. My primary reason > being, off-the-top-of-my-head I don't see how to get it to work. If > you're installing Bio::SeqIO for the first time via CPAN you can't > ask it to install Bio::SeqIO::genbank et al. 
at the same time > because Bio::SeqIO::genbank depends on Bio::SeqIO, giving us some > circularity. True... > I also wouldn't want these things to be complicated. There should > be little in the way of questions to ask during install. Each > module's Build.PL should be ultra-simple with no advanced logic at > all. It should just specify things that are absolute requirements. > This simplicity helps avoid some of the problems we face by > distributing the monolithic Bioperl. > > No, much better for us and for users to provide a Bundle::Bio-SeqIO. I just don't want too much Bundle-itis, as it'll get confusing for newbies (e.g. Vista-itis, or AdobeCS-itis). Bundles should be limited to functional groupings (SeqIO, AlignIO, DB, etc), 'install everything', or distribution-specific sets (GBrowse). I also think (though Hilmar may veto this) that we should work on integrating bioperl-db, network, etc. into this if it goes forward. Here's a question: how do we plan on handling uploading bioperl updates to CPAN via PAUSE? Do we want to run every single module through one pumpkin? Or do we want to have a core dev group PAUSE account? I can see, for instance, removing everything EUtilities-related and submitting it independently using my own PAUSE account, but it would be nice to have it under an umbrella 'bioperl-devs' account instead. >> Now, this doesn't address several related issues, such as how we >> handle versioning of the independent modules (should be in a >> controlled manner), > > When a module is changed, it gets a version bump. Nothing > complicated needs to be done. Transparent and obvious, behaving > like all other CPAN modules would be my choice. > >> what we do about deprecated modules which linger about on CPAN, > > Delete them from CPAN seems appropriate. I know you can do that via PAUSE, but I think a deleted module lingers about on search.cpan.org (unless that's been fixed). This would prob. have to be used sparingly. >> how we deal with PPMs/RPMs/packaging, and so on. 
All have >> possible reasonable ways they can be addressed, I believe. Also, >> I think we should still think about doing regular full-scale >> 'stable' (1.#) releases (sort of our stamp of approval for that >> batch of modules at that point in time, with a reasonable >> 'sell-by' date). > > Yes, we can still choose to take a snapshot and announce it to the > world, but at the module-level nothing special would happen. There > would just be an updated Bundle::Bioperl-everything (or whatever). Right, it would basically be a stamp of certification. >> Again, it should be seriously discussed among the core devs and >> the bioperl community at large prior to any serious work on it, >> and it would be quite a large-scale project, but possibly worth >> it. It can only go forward if there is enough momentum behind it. > > The requirement for this approach is per-module test scripts. Which > as I identified already, is very desirable anyway so we can hit > 100% test coverage. > > So, regardless of anything else can we all agree that per-module > test scripts are a good idea and should be worked on? If so, I'll > look into the feasibility and figure out how much work will be > involved. I think so, but the feasibility issue is critical. Do we want cvs/svn to be divided up into 900 subdirectories (one for each module), or do we want to have a similar directory structure as we have now, but with each module in its own directory? Or leave everything as is and generate Build.PL on-the-fly (prob. least feasible)? This is where it might be wise to do it piecemeal at first (maybe starting with something somewhat segregated like Bio::Tools), then progress from there. 
chris From hartzell at alerce.com Thu Jun 28 13:38:48 2007 From: hartzell at alerce.com (George Hartzell) Date: Thu, 28 Jun 2007 13:38:48 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> Message-ID: <18051.61992.627473.323346@almost.alerce.com> David Messina writes: > > [George] > > Likewise, you probably DON'T want to use this in your config file: > > > > enable-auto-props = yes > > * = svn:keywords="Author Date Id Rev URL" > > > > since it'll do the same thing. > > Ah, so I've been doing it wrong all along then. :) Thanks, George! It's not *wrong* if it's never done anything to you that you've regretted. The right answer depends on your situation.... > [...] > I've googled around and gathered the following as a possible list for > our repo. Since I obviously don't know what I'm doing :), of course > adjust and refine as necessary. > That's a great starting point. Do you have write access to the wiki? Could you link it off of the instructions for using svn? g. 
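A list like this is easy to mistype, and as far as I know svn will happily apply a misspelled name as a custom property rather than complain. As a rough sketch (not something posted in the thread), the shell function below flags any property name in an auto-props section that does not start with "svn:". It assumes every property in the list is meant to be a standard svn: one, and the name check_auto_props is made up for illustration.

```shell
#!/bin/sh
# Sketch only: report auto-props entries whose property names do not
# start with "svn:". Assumes no custom (non-svn:) properties are intended.
# Reads config lines on stdin; prints "pattern: property-name" per problem.
check_auto_props() {
    awk '
        /^[ \t]*#/  { next }                # skip comments
        /^[ \t]*\[/ { next }                # skip section headers like [auto-props]
        /^[ \t]*$/  { next }                # skip blank lines
        {
            eq = index($0, "=")             # first "=" separates pattern from properties
            if (eq == 0) next
            pattern = substr($0, 1, eq - 1)
            gsub(/[ \t]+$/, "", pattern)
            n = split(substr($0, eq + 1), props, ";")
            for (i = 1; i <= n; i++) {
                name = props[i]
                sub(/=.*/, "", name)        # drop the value, keep the property name
                gsub(/^[ \t]+|[ \t]+$/, "", name)
                if (name != "" && name !~ /^svn:/)
                    print pattern ": " name
            }
        }
    '
}

# Possible usage, assuming the list is saved in a file called auto-props.txt:
#   check_auto_props < auto-props.txt
```

An empty result only means every property name has the svn: prefix; it does not check that the name after the prefix or the value is valid.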
From hartzell at alerce.com Thu Jun 28 14:06:50 2007 From: hartzell at alerce.com (George Hartzell) Date: Thu, 28 Jun 2007 14:06:50 -0400 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <4683C385.3050904@sendu.me.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> <4683C385.3050904@sendu.me.uk> Message-ID: <18051.63674.685297.426813@almost.alerce.com> Sendu Bala writes: > [...] > I tried again in the same location and it told me I had to 'svn > cleanup', which I did. But subsequently it kept complaining about files > already being there. You need to do the cleanup because svn exited gracelessly and you needed to help it get back on its feet. The cleanup doesn't remove the stuff that you did get checked out, so it's still there getting in the way of your new checkout. > [...] > svn co > svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data > > causes this repeatable problem: > > [...] > A data/phredfile.phd > svn: In directory 'data' > svn: Can't move source to dest > svn: Can't move 'data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' to > 'data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': No such file or directory > > That is with Mac OS X svn command-line client, version 1.4.4 > > I can get bioperl-live/tags/release-0-9-2/t/data to check out fine with > a linux svn command-line client, version 1.2.3. I'm not 100% sure what's going on here, but I'm inclined to say "get a real computer" (and yes, I'm typing this on a mac...). I have a mac pro that runs FreeBSD-STABLE part time and it's ggrrreeaatt (as Tony the Tiger used to say).... I think that we're having trouble with case sensitivity. 
My only evidence is that I can see where there have been both HUMBETGLOA.FASTA and HUMBETGLOA.fasta in the tree at various times. I can't figure out anything else that's weird about that file. On the other hand, I can't see how this would cause the error you're seeing though. The experiment would be to grab a usb or firewire disk (or even a memory stick), partition/format it as case sensitive (or even *unix*) and try to do svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data into it. If it works, voila. If not, I'll keep making stuff up, err, thinking about it. g. From dmessina at wustl.edu Thu Jun 28 14:15:32 2007 From: dmessina at wustl.edu (David Messina) Date: Thu, 28 Jun 2007 13:15:32 -0500 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <6830CBB0-70A1-4F84-9C52-F61AEDC4B83B@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> <4683C385.3050904@sendu.me.uk> <6830CBB0-70A1-4F84-9C52-F61AEDC4B83B@uiuc.edu> Message-ID: <459D9BC0-4FBA-4560-80A8-E6243DE9D9CC@wustl.edu> Same svn error here on the full checkout. > What local (mac) svn version are you using? I'm running off macports: > > svn --version > svn, version 1.4.4 (r25188) > compiled Jun 16 2007, 23:40:53 I have svn 1.4.3. % svn --version svn, version 1.4.3 (r23084) compiled Apr 1 2007, 02:47:14 Copyright (C) 2000-2006 CollabNet. Subversion is open source software, see http://subversion.tigris.org/ This product includes software developed by CollabNet (http:// www.Collab.Net/). The following repository access (RA) modules are available: * ra_dav : Module for accessing a repository via WebDAV (DeltaV) protocol. 
- handles 'http' scheme * ra_svn : Module for accessing a repository using the svn network protocol. - handles 'svn' scheme * ra_local : Module for accessing a repository on local disk. - handles 'file' scheme From cjfields at uiuc.edu Thu Jun 28 14:54:15 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 28 Jun 2007 13:54:15 -0500 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <18051.63674.685297.426813@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> <4683C385.3050904@sendu.me.uk> <18051.63674.685297.426813@almost.alerce.com> Message-ID: On Jun 28, 2007, at 1:06 PM, George Hartzell wrote: > ... > I'm not 100% sure what's going on here, but I'm inclined to say "get a > real computer" (and yes, I'm typing this on a mac...). I have a mac > pro that runs FreeBSD-STABLE part time and it's ggrrreeaatt (as tony > the tiger used to say).... Ouch! Though it could be worse (**coughwindowscough**). > I think that we're having trouble with case sensitivity. My only > evidence is that I can see where there have been both HUMBETGLOA.FASTA > and HUMBETGLOA.fasta in the tree at various times. I can't figure out > anything else that's weird about that file. On the other hand, I > can't see how this would cause the error you're seeing though. Odd that other branches (including the main trunk) work but that one doesn't. > The experiment would be to grab a usb or firewire disk (or even a > memory stick), partition/format it as case sensitive (or even *unix*) > and try to do > > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- > live/tags/release-0-9-2/t/data > > into it. If it works, voila. 
If not, I'll keep making stuff up, err, > thinking about it. > > g. I'll have to figure out why I can't get ssh keys to work locally to test it out more (I have a usb drive to test with); just don't have time at the moment. chris From dmessina at wustl.edu Thu Jun 28 14:47:04 2007 From: dmessina at wustl.edu (David Messina) Date: Thu, 28 Jun 2007 13:47:04 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <18051.61992.627473.323346@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> <18051.61992.627473.323346@almost.alerce.com> Message-ID: <0027C4E0-26B1-41F3-8FD8-EAB5465CA80E@wustl.edu> > That's a great starting point. Do you have write access to the wiki? > Could you link it off of the instructions for using svn? Done. 
http://www.bioperl.org/wiki/Svn_auto-props linked from: http://www.bioperl.org/wiki/Using_Subversion (bottom of page) From bix at sendu.me.uk Thu Jun 28 15:19:35 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 20:19:35 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> <4683DBEA.90005@sendu.me.uk> <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu> Message-ID: <468409C7.7020102@sendu.me.uk> Chris Fields wrote: > On Jun 28, 2007, at 11:03 AM, Sendu Bala wrote: > Here's a question: how do we plan on handling uploading bioperl > updates to CPAN via PAUSE? Do we want to run every single module > through one pumpkin? Or do we want to have a core dev group PAUSE > account? I can see, for instance, removing everything EUtilities- > related and submitting it independently using my own PAUSE account, > but it would be nice to have it under an umbrella 'bioperl-devs' > account instead. All Bioperl modules (except the Bundle!) are owned by BIOPERLML on PAUSE. It's a little awkward since PAUSE is uploader-centric, but see my notes at http://www.bioperl.org/wiki/Making_a_BioPerl_release And certainly, everything that wants to consider itself part of Bioperl (and gain the benefit of lots of devs looking after it) should have BIOPERLML as the primary owner. > I think so, but the feasibility issue is critical. Do we want cvs/ > svn to be divided up into 900 subdirectories (one for each module), > or do we want to have a similar directory structure as we have now, > but with each module in it's own directory? 
Or leave everything as > is and generate Build.PL on-the-fly (prob. least feasible)? Very definitely the latter. The key benefit of my approach is that the organisation stays as is and that a snapshot of the repository remains a single directory of modules in Bio, so that people don't have to 'install' Bioperl; they can still just uncompress the archive (or check out the package from svn) and point their PERL5LIB to the root dir of the package. For that reason I very much like the idea of folding the current split-out packages (run, network etc.) back into the core package so everything is in one place. Folding them back in should obviously wait until everything is in place and working with core already. My proposal obviously wasn't very clear. As far as all other devs are concerned, nothing changes at all (except for lots of new improved test scripts). The pumpkin will, however, be able to say: ./Build dist Right now that generates the distribution archives (in different compression formats) - one big archive containing everything. My proposal is simply that instead it generates lots of archives, one archive per module. It will also generate some Bundles and whatever else might be needed. I don't envisage any major difficulties in achieving this. The 'feasibility' issue I was going to look into was strictly regarding doing all the new test scripts.
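[Editorial sketch, not part of the original message.] The "uncompress and point PERL5LIB at the root dir" workflow described above might look roughly like the following. The BIOPERL_ROOT path is a hypothetical location chosen for illustration; nothing in the thread specifies where the archive or checkout lives.

```shell
#!/bin/sh
# Sketch only: BIOPERL_ROOT is a hypothetical location for an unpacked
# archive or svn/cvs checkout; substitute your own path.
BIOPERL_ROOT="${BIOPERL_ROOT:-$HOME/src/bioperl-live}"

# Prepend the checkout to PERL5LIB, keeping any entries already set.
PERL5LIB="${BIOPERL_ROOT}${PERL5LIB:+:$PERL5LIB}"
export PERL5LIB

echo "PERL5LIB=$PERL5LIB"

# With that in place, one way to check that perl picks up the checkout
# (assumes perl is installed and Bio/Seq.pm exists under BIOPERL_ROOT):
#   perl -MBio::Seq -e 'print $INC{"Bio/Seq.pm"}, "\n"'
```

Putting the checkout first in PERL5LIB means it should shadow any copy installed system-wide, which is probably what someone working from a checkout wants.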
From hartzell at alerce.com Thu Jun 28 15:43:38 2007 From: hartzell at alerce.com (George Hartzell) Date: Thu, 28 Jun 2007 15:43:38 -0400 Subject: [Bioperl-l] First cut svn repository In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> <4683C385.3050904@sendu.me.uk> <18051.63674.685297.426813@almost.alerce.com> Message-ID: <18052.3946.224905.415905@almost.alerce.com> Chris Fields writes: > > On Jun 28, 2007, at 1:06 PM, George Hartzell wrote: > > > ... > > I'm not 100% sure what's going on here, but I'm inclined to say "get a > > real computer" (and yes, I'm typing this on a mac...). I have a mac > > pro that runs FreeBSD-STABLE part time and it's ggrrreeaatt (as tony > > the tiger used to say).... > > Ouch! Though it could be worse (**coughwindowscough**). > > > I think that we're having trouble with case sensitivity. My only > > evidence is that I can see where there have been both HUMBETGLOA.FASTA > > and HUMBETGLOA.fasta in the tree at various times. I can't figure out > > anything else that's weird about that file. On the other hand, I > > can't see how this would cause the error you're seeing though. > > Odd that other branches (including the main trunk) work but that one > doesn't. > > > The experiment would be to grab a usb or firewire disk (or even a > > memory stick), partition/format it as case sensitive (or even *unix*) > > and try to do > > > > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- > > live/tags/release-0-9-2/t/data > > > > into it. If it works, voila. If not, I'll keep making stuff up, err, > > thinking about it. > > > > g. 
> > I'll have to figure out why I can't get ssh keys to work locally to > test it out more (I have a usb drive to test with); just don't have > time at the moment. I just did the experiment, and filename case-insensitivity seems to be breaking something. I'm using an svn I picked up from http://www.codingmonkeys.de/mbo/. I reformatted a memory stick to be case sensitive and a co of bioperl/bioperl-live/tags/release-0-9-2/t onto it worked; then I made a directory in my home dir (normal mac thing), tried the same co there, and got the same error as above. I can get a copy of the trunk, so I'm inclined to ask someone to mention the problem on the wiki and then just ignore it. g. From cjfields at uiuc.edu Thu Jun 28 16:29:09 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 28 Jun 2007 15:29:09 -0500 Subject: [Bioperl-l] Splits again In-Reply-To: <468409C7.7020102@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> <4683DBEA.90005@sendu.me.uk> <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu> <468409C7.7020102@sendu.me.uk> Message-ID: <026156F4-4C46-4CC6-82B5-07FC5326A244@uiuc.edu> On Jun 28, 2007, at 2:19 PM, Sendu Bala wrote: > Chris Fields wrote: >> On Jun 28, 2007, at 11:03 AM, Sendu Bala wrote: >> Here's a question: how do we plan on handling uploading bioperl >> updates to CPAN via PAUSE? Do we want to run every single module >> through one pumpkin? Or do we want to have a core dev group PAUSE >> account? I can see, for instance, removing everything EUtilities- >> related and submitting it independently using my own PAUSE account, >> but it would be nice to have it under an umbrella 'bioperl-devs' >> account instead. > > All Bioperl modules (except the Bundle!) are owned by BIOPERLML on > PAUSE.
Its a little akward since PAUSE is uploader-centric, but see my > notes at http://www.bioperl.org/wiki/Making_a_BioPerl_release > > And certainly, everything that wants to consider itself part of > Bioperl > (and gain the benefit of lots of devs looking after it) should > certainly > have BIOPERLML as the primary owner. Alrighty then. >> I think so, but the feasibility issue is critical. Do we want cvs/ >> svn to be divided up into 900 subdirectories (one for each module), >> or do we want to have a similar directory structure as we have now, >> but with each module in it's own directory? Or leave everything as >> is and generate Build.PL on-the-fly (prob. least feasible)? > > Very definitely the latter. The key benefit of my approach is that the > organisation stays as is and that a snapshot of the repository > remains a > single directory of modules in Bio so that people don't have to > 'install' Bioperl, they can still just uncompress the archive (or > check > out the package from svn) and point their PERL5LIB to the root dir of > the package. Okay, makes sense. > For that reason I very much like the idea of folding the current > split-out packages (run, network etc.) back into the core package so > everything is one place. Folding them back in should obviously wait > until everything is in place and working with core already. I agree, but that's up to Brian, Hilmar, and the others who donated the packages (or at least a consensus of core devs). One thing at a time. > My proposal obviously wasn't very clear. As far as all other devs are > concerned, nothing changes at all (except for lots of new improved > test > scripts). The pumpkin will, however, be able to say: > > ./Build dist > > Right now that generates the distribution archives (in different > compression formats) - one big archive containing everything. > My proposal is simply that instead it generates lots of archives, one > archive per module. 
It will also generate some Bundles and whatever > else > might be needed. We'll need to define which tests and data go with each module and so on. > I don't envisage any major difficulties in achieving this. The > 'feasibility' issue I was going to look into was strictly regarding > doing all the new test scripts. Okay. Maybe it's worth doing on a branch as a test run when 1.5.3 is ready to go. We'll still need to get thoughts on this from other core devs out there, and it prob. should wait until everybody is comfortable with the idea. chris From dmessina at wustl.edu Thu Jun 28 18:13:48 2007 From: dmessina at wustl.edu (David Messina) Date: Thu, 28 Jun 2007 17:13:48 -0500 Subject: [Bioperl-l] Splits again In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: Coming late to this party, I'm replying to snippets from multiple emails. > [Chris] > what we do about deprecated modules which linger > about on CPAN > [Sendu] > Delete them from CPAN seems appropriate. I coulda sworn this was frowned upon, but a recent thread suggests it's totally kosher. http://www.nntp.perl.org/group/perl.qa/2007/03/msg8473.html > [Sendu] > So, regardless of anything else can we all agree that per-module test > scripts are a good idea and should be worked on? I agree. > [Sendu] > people don't have to > 'install' Bioperl, they can still just uncompress the archive (or > check > out the package from svn) and point their PERL5LIB to the root dir of > the package. Could you elaborate a bit on how this works? How is XS code that needs compiling handled? Or the scripts directory? I would love to be able to do this.
> [Sendu] > For that reason I very much like the idea of folding the current > split-out packages (run, network etc.) back into the core package so > everything is one place. Folding them back in should obviously wait > until everything is in place and working with core already. From an organizational standpoint, I'm concerned that with ~900 modules in core right now, adding all of the additional stuff from the split-out packages would make for a daunting directory. But as you said, this is way down the road, so this proposal doesn't bear on the other, closer-to-now issues on the table. > [Chris] > Okay. Maybe it's worth doing on a branch as a test run when 1.5.3 > is ready to go. We'll still need to get thoughts on this from other > core devs out there, and it prob. should until everybody is > comfortable with the idea. If we go forward with the CPAN split plan, I like the idea of having a trial. We can foresee some of the issues that such a change may bring, and yet still more no doubt wait for us once we do it. Dave From bix at sendu.me.uk Thu Jun 28 18:59:35 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 23:59:35 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: <46843D57.2080409@sendu.me.uk> David Messina wrote: >> people don't have to 'install' Bioperl, they can still just >> uncompress the archive (or check out the package from svn) and >> point their PERL5LIB to the root dir of the package. > > Could you elaborate a bit on how this works? How is XS code that > needs compiling handled? Or the scripts directory? I would love to be > able to do this. I meant for the most part. 
Core doesn't have any XS code so that's not an issue. Scripts can be run manually like any other perl script. When you discover something isn't working because of a missing external dependency, you just install it. (But that happens very rarely.) Personally I've /never/ installed Bioperl and used that installed set of modules. I've always just pointed my PERL5LIB at the distribution folder or my cvs checkout. Which makes me a strange candidate for advocating all these CPAN-specific changes, but there you go ;) From cjfields at uiuc.edu Thu Jun 28 19:03:02 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 28 Jun 2007 18:03:02 -0500 Subject: [Bioperl-l] Splits again In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: <8B6FBB52-5CCE-4122-876C-B9827C86E46E@uiuc.edu> On Jun 28, 2007, at 5:13 PM, David Messina wrote: > Coming late to this party, I'm replying to snippets from multiple > emails. > > >> [Chris] >> what we do about deprecated modules which linger >> about on CPAN > >> [Sendu] >> Delete them from CPAN seems appropriate. > > I coulda sworn this was frowned upon, but a recent thread suggests > it's totally kosher. > > http://www.nntp.perl.org/group/perl.qa/2007/03/msg8473.html As long as it doesn't show up somewhere to confuse newbies I'm okay with it. >> [Sendu] >> people don't have to >> 'install' Bioperl, they can still just uncompress the archive (or >> check >> out the package from svn) and point their PERL5LIB to the root dir of >> the package. > > Could you elaborate a bit on how this works? How is XS code that > needs compiling handled? Or the scripts directory? I would love to > be able to do this. 
Maybe Sendu can add to this, but the XS code is limited to bioperl-ext AFAIK. We could keep that separate until it plays well with bioperl itself. Scripts and examples - maybe packaged along with a Bundle? >> [Sendu] >> For that reason I very much like the idea of folding the current >> split-out packages (run, network etc.) back into the core package so >> everything is one place. Folding them back in should obviously wait >> until everything is in place and working with core already. > > From an organizational standpoint, I'm concerned that with ~900 > modules in core right now, adding all of the additional stuff from > the split-out packages would make for a daunting directory. > > But as you said, this is way down the road, so this proposal > doesn't bear on the other, closer-to-now issues on the table. Well, the code in bioperl-db and network complements code in core, so I agree with Sendu they belong there. They should be under the same scrutiny as the rest anyway (code, tests, etc), but won't be bundled unless there is an 'install everything' Bundle. >> [Chris] >> Okay. Maybe it's worth doing on a branch as a test run when 1.5.3 >> is ready to go. We'll still need to get thoughts on this from other >> core devs out there, and it prob. should until everybody is >> comfortable with the idea. > > If we go forward with the CPAN split plan, I like the idea of > having a trial. We can foresee some of the issues that such a > change may bring, and yet still more no doubt wait for us once we > do it. That's what branches are for: testing stuff out like this. chris From hartzell at alerce.com Thu Jun 28 19:05:32 2007 From: hartzell at alerce.com (George Hartzell) Date: Thu, 28 Jun 2007 19:05:32 -0400 Subject: [Bioperl-l] problem with binary files. Message-ID: <18052.16060.932502.183552@almost.alerce.com> Ok, after pointing out the problem with setting the svn:keywords property on binary files, it turns out that I *did* that.
Worse yet, I set the svn:eol-style to 'native' on everything, including binary files, so depending on your platform they're likely to be fubar. For example, bioperl-run/t/data/H_pylori_J99.glimmer2.icm may or may not be what you expect it to be, depending on whether your eol-style matches the servers and whether any conversions were done. I'll touch up the way that the little tool I'm using calls cvs2svn and redo the repository. g. From n.haigh at sheffield.ac.uk Fri Jun 29 02:59:21 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 29 Jun 2007 07:59:21 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: <4684ADC9.8040404@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 - -- split -- >> [Sendu] >> For that reason I very much like the idea of folding the current >> split-out packages (run, network etc.) back into the core package so >> everything is one place. Folding them back in should obviously wait >> until everything is in place and working with core already. > > From an organizational standpoint, I'm concerned that with ~900 > modules in core right now, adding all of the additional stuff from > the split-out packages would make for a daunting directory. > > But as you said, this is way down the road, so this proposal doesn't > bear on the other, closer-to-now issues on the table. > I don't think this is an issue - it would simply mean everything is under the same version control hierarchy. And with svn it's Soooooo much easier to fiddle around with directory structures > > >> [Chris] >> Okay. Maybe it's worth doing on a branch as a test run when 1.5.3 >> is ready to go. 
We'll still need to get thoughts on this from other >> core devs out there, and it prob. should until everybody is >> comfortable with the idea. > > If we go forward with the CPAN split plan, I like the idea of having > a trial. We can foresee some of the issues that such a change may > bring, and yet still more no doubt wait for us once we do it. > Under svn it would be easy to make an "svn copy" of run, network etc into a branch of live to test this out. Not that this might be a problem, but: Since we are looking at bioperl-* packages being under the same svn repository, then then "svn copy's" are cheap for disk space. > > Dave > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGhK3JczuW2jkwy2gRAtI2AJ4kNrpGY8XMMh9KxOqs+l0PrEVcwgCfVFj6 BCvltmPyWF4ImueYmd7VFAc= =ktl+ -----END PGP SIGNATURE----- From n.haigh at sheffield.ac.uk Fri Jun 29 03:05:33 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 29 Jun 2007 08:05:33 +0100 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <18051.61992.627473.323346@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> <18051.61992.627473.323346@almost.alerce.com> Message-ID: <4684AF3D.5090907@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 George Hartzell wrote: - -- snip -- > > [...] > > I've googled around and gathered the following as a possible list for > > our repo. 
Since I obviously don't know what I'm doing :), of course > > adjust and refine as necessary. > > > > That's a great starting point. Do you have write access to the wiki? > Could you link it off of the instructions for using svn? > > g. Don't .t files need adding to the auto-props? Nath -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGhK89czuW2jkwy2gRAnRGAJ0VnBNVBAdQdfUnqPhmvsyQnD/bswCggSHC /Iivb6Lc4/51bUdrTmRQYlE= =V+t2 -----END PGP SIGNATURE----- From sac at bioperl.org Fri Jun 29 04:25:36 2007 From: sac at bioperl.org (Steve Chervitz) Date: Fri, 29 Jun 2007 01:25:36 -0700 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> Message-ID: <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> On 6/27/07, Chris Fields wrote: > > On Jun 26, 2007, at 3:21 PM, George Hartzell wrote: > > > ... > > If you have a dev.open-bio.org account and you're in the bioperl > > group, you're good to get at it via: > > > > file:///home/hartzell/bioperl > > > > or > > > > svn+ssh://dev.open-bio.org/home/hartzell/bioperl > > I managed to get it working using file://. Haven't tried svn+ssh yet > but I've had persistent problems getting ssh to work properly on my > macbook; not sure why yet but I haven't had time to play around with it. Are you using the ssh that comes installed with OSX? If so, I'd recommend installing openssh from MacPorts. I recall having issues with the stock version which were resolved by using the more up-to-date version you can get via MacPorts. 
BTW, I haven't been able to check out the new svn repository via svn+ssh:// because I can't get svn to authenticate with an alternative username. My username on dev.open-bio.org differs from what it is on my local machine, so I issue a command such as: steve at localhost $ svn --username sac checkout svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk but I get challenged with: steve at dev.open-bio.org's password: I also tried putting the --username argument after the subcommand, but it still wants to use my local username. I can ssh -l sac into the dev box no problem. Any suggestions? Steve From bix at sendu.me.uk Fri Jun 29 04:52:42 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 29 Jun 2007 09:52:42 +0100 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> Message-ID: <4684C85A.5030206@sendu.me.uk> Steve Chervitz wrote: > BTW, I haven't been able to check out the new svn repository via > svn+ssh:// because I can't get svn to authenticate with an alternative > username. My username on dev.open-bio.org differs from what it is on > my local machine, so I issue a command such as: > > steve at localhost $ svn --username sac checkout > svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk > > but I get challenged with: > steve at dev.open-bio.org's password: > > I also tried putting the --username argument after the subcommand, but > it still wants to use my local username. I can ssh -l sac into the dev > box no problem. Any suggestions? 
Set up your ssh key on the dev machine. I'm also on a machine with the wrong username and it works even without attempting to supply the correct one. It does, however, show the 'Welcome to the new developer system' message 2 or 3 times for every svn+ssh action, which freaks me out a little. From N.Haigh at sheffield.ac.uk Fri Jun 29 05:32:38 2007 From: N.Haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 29 Jun 2007 10:32:38 +0100 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> Message-ID: <1183109558.4684d1b69bcec@webmail.shef.ac.uk> Quoting Steve Chervitz : -- snip -- > BTW, I haven't been able to check out the new svn repository via > svn+ssh:// because I can't get svn to authenticate with an alternative > username. My username on dev.open-bio.org differs from what it is on > my local machine, so I issue a command such as: > > steve at localhost $ svn --username sac checkout > svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk > > but I get challenged with: > steve at dev.open-bio.org's password: > > I also tried putting the --username argument after the subcommand, but > it still wants to use my local username. I can ssh -l sac into the dev > box no problem. Any suggestions? 
> > Steve > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > You could try: svn+ssh://USERNAME at dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk Nath From dmessina at wustl.edu Fri Jun 29 08:28:26 2007 From: dmessina at wustl.edu (David Messina) Date: Fri, 29 Jun 2007 07:28:26 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> Message-ID: <42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu> > > BTW, I haven't been able to check out the new svn repository via > svn+ssh:// because I can't get svn to authenticate with an alternative > username. I have the same issue. I set up a stanza in my ~/.ssh/config: Host dev.open-bio.org User dave_messina where dave_messina is my dev.open-bio.org username. 
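[Editorial sketch, not part of the original messages.] The two workarounds suggested above (a username embedded in the svn+ssh URL, and a per-host stanza in ~/.ssh/config) could be sketched as below. DEV_USER is a placeholder, and the stanza is written to a scratch file rather than to the real ~/.ssh/config so that running the sketch changes nothing. Note that the archive renders "@" as " at "; the real URL needs the "@".

```shell
#!/bin/sh
# Sketch only: DEV_USER is a placeholder for your login name on the dev host.
DEV_USER="your_dev_username"
DEV_HOST="dev.open-bio.org"
REPO_PATH="/home/hartzell/bioperl/bioperl-live/trunk"

# Option 1: put the remote username into the URL itself, so ssh does not
# fall back to the local login name.
CHECKOUT_URL="svn+ssh://${DEV_USER}@${DEV_HOST}${REPO_PATH}"
echo "svn checkout ${CHECKOUT_URL}"

# Option 2: a per-host stanza for ~/.ssh/config. It is written to a scratch
# file here for inspection; append it to ~/.ssh/config by hand if wanted.
SSH_STANZA="$(mktemp)"
cat > "$SSH_STANZA" <<EOF
Host ${DEV_HOST}
    User ${DEV_USER}
EOF
cat "$SSH_STANZA"
```

With the stanza in place, the plain svn+ssh://dev.open-bio.org/... URL should pick up the right login name, since svn hands the connection to ssh and ssh reads its own config.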
From cjfields at uiuc.edu Fri Jun 29 13:00:27 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 29 Jun 2007 12:00:27 -0500 Subject: [Bioperl-l] ssh keys [was Re: First cut svn repository] In-Reply-To: <42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> <42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu> Message-ID: On Jun 29, 2007, at 7:28 AM, David Messina wrote: >> >> BTW, I haven't been able to check out the new svn repository via >> svn+ssh:// because I can't get svn to authenticate with an >> alternative >> username. > > I have the same issue. I set up a stanza in my ~/.ssh/config: > > Host dev.open-bio.org > User dave_messina > > where dave_messina is my dev.open-bio.org username. I changed to the macports ssh w/o luck. It appears the key is offered up, so maybe the problem is how I have everything set up on dev (though I followed everything on the wiki): .... Contact 'support at open-bio.org' for your new login information. ====================================== debug1: Authentications that can continue: publickey,gssapi-with- mic,password debug1: Next authentication method: publickey debug1: Offering public key: /Users/cjfields/.ssh/id_dsa debug2: we sent a publickey packet, wait for reply debug1: Authentications that can continue: publickey,gssapi-with- mic,password debug2: we did not send a packet, disable method debug1: Next authentication method: password It's odd; I can use passwordless logins for other servers (admittedly Mac servers) w/o problems using ssh keys, but dev.open-bio.org always prompts for a password regardless. 
My feeling is it's something with my local ssh or sshd config; I'll try fiddling with it to see what happens. Anyone have suggestions? I've lost enough hair as is; don't want to lose more! chris From sac at bioperl.org Fri Jun 29 13:07:45 2007 From: sac at bioperl.org (Steve Chervitz) Date: Fri, 29 Jun 2007 10:07:45 -0700 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <1183109558.4684d1b69bcec@webmail.shef.ac.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> <1183109558.4684d1b69bcec@webmail.shef.ac.uk> Message-ID: <8f200b4c0706291007x2b765323n75c9003a47fe7cbb@mail.gmail.com> On 6/29/07, Nathan S. Haigh wrote: > Quoting Steve Chervitz : > > -- snip -- > > > BTW, I haven't been able to check out the new svn repository via > > svn+ssh:// because I can't get svn to authenticate with an alternative > > username. My username on dev.open-bio.org differs from what it is on > > my local machine, so I issue a command such as: > > > > steve at localhost $ svn --username sac checkout > > svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk > > > > but I get challenged with: > > steve at dev.open-bio.org's password: > > > > I also tried putting the --username argument after the subcommand, but > > it still wants to use my local username. I can ssh -l sac into the dev > > box no problem. Any suggestions? > > [...] > You could try: > svn+ssh://USERNAME at dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk Bingo. Thanks for the tips, guys. BTW, setting up ssh keys was not the issue, since my key is already set up on the dev machine. 
The svn --username setting appears to not be operative at the ssh layer. I suspected this might be the case given that the usage info says: $ svn --help co --username arg : specify a username ARG --password arg : specify a password ARG which seemed insecure. I didn't want to send my password in the clear, and didn't know whether svn would hand it off to ssh. It wasn't even sending my username to ssh, so I knew something was wrong. These args are probably only intended for accessing local svn repositories, or non-svn+ssh-based checkouts. BTW, the svn+ssh checkout on Mac OS X works for me. I'm using svn and openssh installed via MacPorts: $ svn --version svn, version 1.4.4 (r25188) compiled Jun 28 2007, 23:51:53 $ ssh -version OpenSSH_4.6p1, OpenSSL 0.9.8e 23 Feb 2007 Steve From hartzell at alerce.com Fri Jun 29 15:19:31 2007 From: hartzell at alerce.com (George Hartzell) Date: Fri, 29 Jun 2007 15:19:31 -0400 Subject: [Bioperl-l] ssh keys [was Re: First cut svn repository] In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> <42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu> Message-ID: <18053.23363.102371.602742@almost.alerce.com> Chris Fields writes: > > On Jun 29, 2007, at 7:28 AM, David Messina wrote: > > >> > >> BTW, I haven't been able to check out the new svn repository via > >> svn+ssh:// because I can't get svn to authenticate with an > >> alternative > >> username. > > > > I have the same issue. I set up a stanza in my ~/.ssh/config: > > > > Host dev.open-bio.org > > User dave_messina > > > > where dave_messina is my dev.open-bio.org username. > > I changed to the macports ssh w/o luck.
It appears the key is > offered up, so maybe the problem is how I have everything set up on > dev (though I followed everything on the wiki): A couple of things to check. - make sure that you put your public key in ~/.ssh/authorized_keys2 (not authorized_keys) - make sure that authorized_keys2 is chmod'ed 600 (644 might be enough...). - make sure that ~/.ssh is chmoded 700. - make sure that your home directory is 755. Then see if it works. You might be able to relax some of those protections a bit, but ssh's uptight about letting other people mess with that data. g. From dmessina at wustl.edu Fri Jun 29 18:47:14 2007 From: dmessina at wustl.edu (David Messina) Date: Fri, 29 Jun 2007 17:47:14 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <4684AF3D.5090907@sheffield.ac.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> <18051.61992.627473.323346@almost.alerce.com> <4684AF3D.5090907@sheffield.ac.uk> Message-ID: <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu> > [Nathan] > Don't .t files need adding to the auto-props? Yes -- thanks for reminding me. Please feel free to add it to the wiki page. I'll be tweaking it some more later on in any case. Dave From n.haigh at sheffield.ac.uk Sat Jun 30 05:55:56 2007 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Sat, 30 Jun 2007 10:55:56 +0100 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> <18051.61992.627473.323346@almost.alerce.com> <4684AF3D.5090907@sheffield.ac.uk> <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu> Message-ID: <468628AC.9060200@sheffield.ac.uk> David Messina wrote: >> [Nathan] >> Don't .t files need adding to the auto-props? > > Yes -- thanks for reminding me. Please feel free to add it to the wiki > page. I'll be tweaking it some more later on in any case. > > > Dave I noticed this has already been done. I have just been through the t/data dir and added a list of extensions I found (without props). There are some files without extensions, how should these be dealt with? There seems to be a plethora of file naming styles which means there's a pretty long list of non-standard extensions. So at some point someone will commit a new data file with a new extension (often describing what program created the output or the test for which it's intended) that won't be in the auto-props file - can you think of a way around this? 
Nath From cjfields at uiuc.edu Sat Jun 30 08:48:10 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 30 Jun 2007 07:48:10 -0500 Subject: [Bioperl-l] ssh keys [was Re: First cut svn repository] In-Reply-To: <18053.23363.102371.602742@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> <42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu> <18053.23363.102371.602742@almost.alerce.com> Message-ID: <3874B4EE-0119-40BC-8B92-11133A766417@uiuc.edu> On Jun 29, 2007, at 2:19 PM, George Hartzell wrote: > Chris Fields writes: >> >> On Jun 29, 2007, at 7:28 AM, David Messina wrote: >> >>>> >>>> BTW, I haven't been able to check out the new svn repository via >>>> svn+ssh:// because I can't get svn to authenticate with an >>>> alternative >>>> username. >>> >>> I have the same issue. I set up a stanza in my ~/.ssh/config: >>> >>> Host dev.open-bio.org >>> User dave_messina >>> >>> where dave_messina is my dev.open-bio.org username. >> >> I changed to the macports ssh w/o luck. It appears the key is >> offered up, so maybe the problem is how I have everything set up on >> dev (though I followed everything on the wiki): > > A couple of things to check. > > - make sure that you put your public key in ~/.ssh/authorized_keys2 > (not authorized_keys) > > - make sure that authorized_keys2 is chmod'ed 600 (644 might be > enough...). > > - make sure that ~/.ssh is chmoded 700. > > - make sure that your home directory is 755. > > Then see if it works. You might be able to relax some of those > protections a bit, but ssh's uptight about letting other people mess > with that data. > > g. 
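The checklist quoted above can be sketched as a few shell commands. This is only an illustration run against a scratch directory, not a real home directory: the helper name `tighten_ssh_perms` and the scratch path are assumptions made for the example, while the modes are the ones listed in the checklist.

```shell
#!/bin/sh
# Apply the permissions from the checklist to a home directory.
# tighten_ssh_perms is a hypothetical helper, not part of ssh or svn.
tighten_ssh_perms() {
    home_dir=$1
    chmod 755 "$home_dir"                        # home: no group/other write
    chmod 700 "$home_dir/.ssh"                   # .ssh: owner only
    chmod 600 "$home_dir/.ssh/authorized_keys2"  # key file: owner read/write
}

# Demonstrate on a throwaway directory so nothing real is touched.
demo_home=$(mktemp -d)
mkdir "$demo_home/.ssh"
: > "$demo_home/.ssh/authorized_keys2"
tighten_ssh_perms "$demo_home"
ls -ld "$demo_home" "$demo_home/.ssh" "$demo_home/.ssh/authorized_keys2"
```

On a real account the call would presumably be `tighten_ssh_perms "$HOME"`, run on the dev machine itself.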
Got it working; it was the permissions on my home dir (the last one). Thanks George! chris From dmessina at wustl.edu Sat Jun 30 11:37:44 2007 From: dmessina at wustl.edu (David Messina) Date: Sat, 30 Jun 2007 10:37:44 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <468628AC.9060200@sheffield.ac.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> <18051.61992.627473.323346@almost.alerce.com> <4684AF3D.5090907@sheffield.ac.uk> <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu> <468628AC.9060200@sheffield.ac.uk> Message-ID: <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu> > I have just been through the t/data dir and added a list of > extensions I found Thanks! That's a big help. I'll add prop definitions to those shortly. > There are some files without extensions, how should these be dealt > with? If you look in the text files section, there are some files there which don't have extensions, e.g. AUTHORS, BUGS. There's also Makefile.* so we have some flexibility in how svn knows to auto-prop a file. I haven't read up on the details yet to find out how it handles files that match multiple criteria -- it may be dependent simply on the order they're defined. > There seems to be a plethora of file naming styles which means > there's a pretty long list of non-standard extensions. So at some > point someone will commit a new data file with a new extension > (often describing what program created the output or the test for > which it's intended) that won't be in the auto-props file - can you > think of a way around this? I've been thinking about this a bit. How about this?
- We have just "standard" files and extensions (like *.blast, *.fasta) in the auto-props list. - We manually add props for the files that have nonstandard, arbitrary extensions so all the files have now are prop'd. - At some point we rename those nonstandard files to have standard extensions. Especially for the t/data/ files, we'll have to make sure to update the tests that rely on them. - We can have the suggested list of extensions for new files that get added. I don't think we need to strictly enforce this just for the sake of svn (after all, its primary function of version control will work just fine without any properties set), but it would be nice if we could try to keep to it mostly. Many distros come with an /etc/mime.types file which has the list of officially registered MIME types. I found a script that will take this list and convert it into auto-props format. I don't think we need to support *all* of the gazillion filetypes since most of the them our repository will never see, but we certainly could. 
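As a rough illustration of that kind of conversion (this is not the script mentioned above, which isn't shown here), something like the following would turn mime.types-style lines into auto-props entries. The function name and the sample input are assumptions for the example; the output uses the `pattern = property=value` form of the `[auto-props]` section in a Subversion config file.

```shell
#!/bin/sh
# Convert mime.types-style lines ("type ext1 ext2 ...") read on stdin
# into Subversion auto-props entries, one per extension.
# mime_to_autoprops is a hypothetical name used only for this sketch.
mime_to_autoprops() {
    awk '
        /^[ \t]*#/ { next }          # skip comment lines
        NF < 2     { next }          # skip blank lines and types with no extensions
        {
            for (i = 2; i <= NF; i++)
                printf "*.%s = svn:mime-type=%s\n", $i, $1
        }
    '
}

# Sample input standing in for /etc/mime.types.
mime_to_autoprops <<'EOF'
# comment line
image/png png
text/html html htm
application/x-unknown
EOF
```

Filtering the result down to the handful of types the repository actually contains would probably be more useful than pasting in the whole list.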
Dave From dmessina at wustl.edu Sat Jun 30 12:26:27 2007 From: dmessina at wustl.edu (David Messina) Date: Sat, 30 Jun 2007 11:26:27 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> <18051.61992.627473.323346@almost.alerce.com> <4684AF3D.5090907@sheffield.ac.uk> <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu> <468628AC.9060200@sheffield.ac.uk> <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu> Message-ID: On Jun 30, 2007, at 10:37 AM, David Messina wrote: > - We manually add props for the files that have nonstandard, > arbitrary extensions so all the files have now are prop'd. Er, that should be - We manually add props for the files that have nonstandard, arbitrary extensions so that all the files now in the repository are prop'd. From n.haigh at sheffield.ac.uk Sat Jun 30 13:25:58 2007 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Sat, 30 Jun 2007 18:25:58 +0100 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> <18051.61992.627473.323346@almost.alerce.com> <4684AF3D.5090907@sheffield.ac.uk> <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu> <468628AC.9060200@sheffield.ac.uk> <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu> Message-ID: <46869226.70203@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 - -- snip -- > > >> There seems to be a plethora of file naming styles which means there's >> a pretty long list of non-standard extensions. So at some point >> someone will commit a new data file with a new extension (often >> describing what program created the output or the test for which it's >> intended) that won't be in the auto-props file - can you think of a >> way around this? > > Ive been thinking about this a bit. How about this? > > - We have just "standard" files and extensions (like *.blast, *.fasta) > in the auto-props list. I think the list of seq formats recognised by Bioperl in Bio::SeqIO and Bio::AlignIO would be a good start. As these are likely to be the ones that are sensitive to file format recognition and thus could break tests if renamed. I think a lot of people have used "." in file names as an alternative to a space. I think it would be beneficial to use an underscore "_" in these cases and leave the "." to represent the beginning of the file extension. 
> > - We manually add props for the files that have nonstandard, arbitrary > extensions so all the files that we currently have now are prop'd. > > - At some point we rename those nonstandard files to have standard > extensions. Especially for the t/data/ files, we'll have to make sure to > update the tests that rely on them. Nice and easy with svn :) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGhpHiczuW2jkwy2gRAuZ5AKCnd2MvCsvSn1NemDVMmabnieR2vACg1Qk0 pYVvXwxq0lpiGfM09RQ6A1I= =3Lhw -----END PGP SIGNATURE----- From cjfields at uiuc.edu Sat Jun 30 15:11:52 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 30 Jun 2007 14:11:52 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> <18051.61992.627473.323346@almost.alerce.com> <4684AF3D.5090907@sheffield.ac.uk> <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu> <468628AC.9060200@sheffield.ac.uk> <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu> Message-ID: On Jun 30, 2007, at 11:26 AM, David Messina wrote: > > On Jun 30, 2007, at 10:37 AM, David Messina wrote: > >> - We manually add props for the files that have nonstandard, >> arbitrary extensions so all the files have now are prop'd. > > Er, that should be > > - We manually add props for the files that have nonstandard, > arbitrary extensions so that all the files now in the repository are > prop'd. Do we need to define every filetype extension, or can there be a fallback (eg if it isn't on the list or has no extension it's plain text)? 
chris From hlapp at gmx.net Sat Jun 30 17:26:22 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Sat, 30 Jun 2007 17:26:22 -0400 Subject: [Bioperl-l] Splits again In-Reply-To: <468409C7.7020102@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> <4683DBEA.90005@sendu.me.uk> <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu> <468409C7.7020102@sendu.me.uk> Message-ID: On Jun 28, 2007, at 3:19 PM, Sendu Bala wrote: > [...] > Very definitely the latter. The key benefit of my approach is that > the organisation stays as is and that a snapshot of the repository > remains a single directory of modules in Bio so that people don't > have to 'install' Bioperl, they can still just uncompress the > archive (or check out the package from svn) and point their > PERL5LIB to the root dir of the package. I think this is absolutely key to keep in mind. Anything without this feature will likely be a non-starter. I don't really have time to follow the discussion let alone participate, so really all I can contribute is to offer some sanity/ reality checks (such as the above). In this sense, I understand a release pumpkin will generate ~900 packages to upload to CPAN? How much hassle is that compared to what uploading a bioperl release means right now? How brittle is all the Build.PL code that will be needed to automate all of this, and how difficult will it be to maintain? For example, if someone adds in 10 new modules, what Build.PL-related work will need to be done? 
-hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From bix at sendu.me.uk Sat Jun 30 17:32:52 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Sat, 30 Jun 2007 22:32:52 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> <4683DBEA.90005@sendu.me.uk> <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu> <468409C7.7020102@sendu.me.uk> Message-ID: <4686CC04.6000403@sendu.me.uk> Hilmar Lapp wrote: > On Jun 28, 2007, at 3:19 PM, Sendu Bala wrote: > >> [...] >> Very definitely the latter. The key benefit of my approach is that >> the organisation stays as is and that a snapshot of the repository >> remains a single directory of modules in Bio so that people don't >> have to 'install' Bioperl, they can still just uncompress the >> archive (or check out the package from svn) and point their >> PERL5LIB to the root dir of the package. [snip] > In this sense, I understand a release pumpkin will generate ~900 > packages to upload to CPAN? How much hassle is that compared to what > uploading a bioperl release means right now? I'd have to investigate. I did my uploads using the PAUSE website, which for 900 packages would be unfeasible. Will have to see if the process can be automated. > How brittle is all the Build.PL code that will be needed to automate > all of this, and how difficult will it be to maintain? For example, > if someone adds in 10 new modules, what Build.PL-related work will > need to be done? Well, my plan will be that once the work is done, you won't need to touch the Build.PL code again. 
My intent is that the pumpkin can just type one command and not think about anything. As for the reality, I won't know until I think about it properly and experiment. From hlapp at gmx.net Sat Jun 30 19:36:45 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Sat, 30 Jun 2007 19:36:45 -0400 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <18052.3946.224905.415905@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> <4683C385.3050904@sendu.me.uk> <18051.63674.685297.426813@almost.alerce.com> <18052.3946.224905.415905@almost.alerce.com> Message-ID: <2159ED58-E6F4-4ED8-AC23-E8BAF69FE240@gmx.net> On Jun 28, 2007, at 3:43 PM, George Hartzell wrote: > I just did the experiment, and filename-insensitivity seems to be > breaking something. > > I'm using an svn I picked up from http://www.codingmonkeys.de/mbo/. > > I reformatted a memory stick to be case sensitive and co of > > bioperl/bioperl-live/tags/release-0-9-2/t > > worked, then I made a directory in my home dir (normal mac thing) and > got the same error as above. You picked up a rename of a file from lower case extension to upper case extension. Unfortunately, there are several months between adding the upper-case and removing the lower-case version. 
We can reconstruct what happened with this using svn log on the directory (this does not require a checkout): $ svn log --verbose svn+ssh://dev.open-bio.org/home/hartzell/bioperl/ bioperl-live/trunk/t/data Searching for HUMBETGLOA yields the following two commits that added one and removed the other: ------------------------------------------------------------------------ r2245 | jason | 2001-12-08 11:59:05 -0500 (Sat, 08 Dec 2001) | 2 lines Changed paths: M /bioperl-live/trunk/t/SearchIO.t A /bioperl-live/trunk/t/data/HUMBETGLOA.FASTA A /bioperl-live/trunk/t/data/cysprot1.FASTA added tests for FASTA ------------------------------------------------------------------------ r2877 | jason | 2002-03-11 22:39:40 -0500 (Mon, 11 Mar 2002) | 2 lines Changed paths: A /bioperl-live/trunk/t/data/HUMBETGLOA.fa D /bioperl-live/trunk/t/data/HUMBETGLOA.fasta renaming file to avoid clobbering on windows Unfortunately, both files are in the tag (again, no checkout required): $ svn list svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- live/tags/release-0-9-2/t/data | grep HUMBETGLOA | grep -i fasta HUMBETGLOA.FASTA HUMBETGLOA.fasta We can remove the offending version from the repository (again, without needing a checkout): $ svn rm svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- live/tags/release-0-9-2/t/data/HUMBETGLOA.fasta I did this, and now the tag checks out fine on OSX. Can anyone confirm? 
(BTW the ability to operate on the repository w/o needing a checkout is another advantage of svn) -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Sat Jun 30 20:40:53 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 30 Jun 2007 19:40:53 -0500 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <2159ED58-E6F4-4ED8-AC23-E8BAF69FE240@gmx.net> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> <4683C385.3050904@sendu.me.uk> <18051.63674.685297.426813@almost.alerce.com> <18052.3946.224905.415905@almost.alerce.com> <2159ED58-E6F4-4ED8-AC23-E8BAF69FE240@gmx.net> Message-ID: Checkout worked for me (Mac OS X) using both: svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/ tags/release-0-9-2/t/data svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/ tags/release-0-9-2/ so removing the offending file worked (good catch!). Haven't run a full co but probably isn't necessary. chris On Jun 30, 2007, at 6:36 PM, Hilmar Lapp wrote: > > On Jun 28, 2007, at 3:43 PM, George Hartzell wrote: > >> I just did the experiment, and filename-insensitivity seems to be >> breaking something. >> >> I'm using an svn I picked up from http://www.codingmonkeys.de/mbo/. >> >> I reformatted a memory stick to be case sensitive and co of >> >> bioperl/bioperl-live/tags/release-0-9-2/t >> >> worked, then I made a directory in my home dir (normal mac thing) and >> got the same error as above. > > You picked up a rename of a file from lower case extension to upper > case extension. 
Unfortunately, there are several months between > adding the upper-case and removing the lower-case version. > > We can reconstruct what happened with this using svn log on the > directory (this does not require a checkout): > > $ svn log --verbose svn+ssh://dev.open-bio.org/home/hartzell/ > bioperl/bioperl-live/trunk/t/data > > Searching for HUMBETGLOA yields the following two commits that > added one and removed the other: > > ---------------------------------------------------------------------- > -- > r2245 | jason | 2001-12-08 11:59:05 -0500 (Sat, 08 Dec 2001) | 2 lines > Changed paths: > M /bioperl-live/trunk/t/SearchIO.t > A /bioperl-live/trunk/t/data/HUMBETGLOA.FASTA > A /bioperl-live/trunk/t/data/cysprot1.FASTA > > added tests for FASTA > > ---------------------------------------------------------------------- > -- > r2877 | jason | 2002-03-11 22:39:40 -0500 (Mon, 11 Mar 2002) | 2 lines > Changed paths: > A /bioperl-live/trunk/t/data/HUMBETGLOA.fa > D /bioperl-live/trunk/t/data/HUMBETGLOA.fasta > > renaming file to avoid clobbering on windows > > Unfortunately, both files are in the tag (again, no checkout > required): > > $ svn list svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- > live/tags/release-0-9-2/t/data | grep HUMBETGLOA | grep -i fasta > HUMBETGLOA.FASTA > HUMBETGLOA.fasta > > We can remove the offending version from the repository (again, > without needing a checkout): > > $ svn rm svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- > live/tags/release-0-9-2/t/data/HUMBETGLOA.fasta > > I did this, and now the tag checks out fine on OSX. Can anyone > confirm? > > (BTW the ability to operate on the repository w/o needing a > checkout is another advantage of svn) > > -hilmar > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hartzell at alerce.com Sat Jun 30 20:48:06 2007 From: hartzell at alerce.com (George Hartzell) Date: Sat, 30 Jun 2007 17:48:06 -0700 Subject: [Bioperl-l] Take 2 of the new subversion repository. Message-ID: <18054.63942.316904.413911@almost.alerce.com> There's a second cut at the subversion repository. I've done a better job of setting svn:keywords and svn:eol-style on various files. The defaults were more cautious and I used an auto-props file based on the wiki version. svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take2 The old repository's still around as svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take1 I renamed it so that people wouldn't work with it by mistake. If, for some hard-to-imagine reason, you have a working copy that you want to run against it, you should be able to do an svn switch --relocate on your working copy and be back in shape. In fact, it might be a good time to give it a try.... g.
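A small sketch of a pre-flight check for the relocate suggested above. Subversion identifies a repository by the `Repository UUID` line that `svn info` prints, and `svn switch --relocate` is meant for the same repository reached under a different URL, so comparing the two UUIDs first should show whether a relocate has any chance of working. The helper names are made up for this example, and the sample text stands in for real `svn info` output.

```shell
#!/bin/sh
# Extract the repository UUID from `svn info` output supplied on stdin.
# repo_uuid and same_repo are hypothetical helpers for this sketch.
repo_uuid() {
    sed -n 's/^Repository UUID: //p'
}

# Succeed only when two UUID strings are non-empty and identical.
same_repo() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Sample text standing in for `svn info` run inside a working copy.
wc_uuid=$(printf 'Path: .\nRepository UUID: 1111-aaaa\n' | repo_uuid)
# Sample text standing in for `svn info` run against the new URL.
new_uuid=$(printf 'Repository UUID: 2222-bbbb\n' | repo_uuid)

if same_repo "$wc_uuid" "$new_uuid"; then
    echo "same repository: relocate should be possible"
else
    echo "different repository: a fresh checkout is needed"
fi
```

In practice the two values would presumably come from `svn info | repo_uuid` in the working copy and `svn info URL | repo_uuid` for the target.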
From hartzell at alerce.com Sat Jun 30 21:17:18 2007 From: hartzell at alerce.com (George Hartzell) Date: Sat, 30 Jun 2007 18:17:18 -0700 Subject: [Bioperl-l] First cut svn repository In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> <4683C385.3050904@sendu.me.uk> <18051.63674.685297.426813@almost.alerce.com> <18052.3946.224905.415905@almost.alerce.com> <2159ED58-E6F4-4ED8-AC23-E8BAF69FE240@gmx.net> Message-ID: <18055.158.30409.808612@almost.alerce.com> Chris Fields writes: > Checkout worked for me (Mac OS X) using both: > > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/ > tags/release-0-9-2/t/data > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/ > tags/release-0-9-2/ > > so removing the offending file worked (good catch!). Haven't run a > full co but probably isn't necessary. > [...] I'll keep a note of that as something to do when I prepare the final cut of the repository. g. From jason at bioperl.org Sat Jun 30 21:25:30 2007 From: jason at bioperl.org (Jason Stajich) Date: Sat, 30 Jun 2007 18:25:30 -0700 Subject: [Bioperl-l] Take 2 of the new subversion repository. In-Reply-To: <18054.63942.316904.413911@almost.alerce.com> References: <18054.63942.316904.413911@almost.alerce.com> Message-ID: Thanks George - I also did chgrp -R bioperl /home/hartzell/bioperl_take? to make sure the group permission was set right. We may also want to do a chmod g+s on all the dirs in there as well so that permissions are preserved when this gets deployed for real. 
Anyone who wants to is welcome to make some changes to files and commit them, and to make some branches/tags to play around a little bit, since we'll likely throw this away and do it again from a locked-down version from CVS at some appointed time. Do you know how to have svn commit messages generate summary emails as well? -j On Jun 30, 2007, at 5:48 PM, George Hartzell wrote: > > There's a second cut at the subversion repository. I've done a better > job of setting svn:keywords and svn:eol-style on various files. The > defaults were more cautious and I used an auto-props file based on > the wiki version. > > svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take2 > > The old repository's still around as > > svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take1 > > I renamed it so that people wouldn't work with it by mistake. If, for > some hard-to-imagine reason, you have a working copy that you want to > run against it, you should be able to do an svn switch --relocate on > your working copy and be back in shape. In fact, it might be a good > time to give it a try.... > > g. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From hlapp at gmx.net Sat Jun 30 22:21:25 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Sat, 30 Jun 2007 22:21:25 -0400 Subject: [Bioperl-l] Take 2 of the new subversion repository. In-Reply-To: <18054.63942.316904.413911@almost.alerce.com> References: <18054.63942.316904.413911@almost.alerce.com> Message-ID: <5F53A433-BAA9-431D-A0C5-5955690D0B73@gmx.net> On Jun 30, 2007, at 8:48 PM, George Hartzell wrote: > I renamed it so that people wouldn't work with it by mistake. If, for > some hard-to-imagine reason, you have a working copy that you want to > run against it, It's not so hard to imagine - checking out the entire repository takes a long time.
> you should be able to do an svn switch --relocate on > your working copy and be back in shape. In fact, it might be a good > time to give it a try.... It doesn't work: svn: The repository at 'svn+ssh://dev.open-bio.org/home/hartzell/ bioperl_take2' has uuid '31277767-6726-dc11-ab4c-0019e3f901d6', but the WC has '27e854f1-f323-dc11-8c1b-0019e3f901d6' You can't relocate to a totally new repository (relocating to bioperl_take1 does work though). -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Sat Jun 30 22:39:27 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 30 Jun 2007 21:39:27 -0500 Subject: [Bioperl-l] Take 2 of the new subversion repository. In-Reply-To: References: <18054.63942.316904.413911@almost.alerce.com> Message-ID: <7C6FD6C9-CBED-40D3-BA90-4B34F79E6DE0@uiuc.edu> There are a few CPAN modules available; here's one: http://search.cpan.org/~dwheeler/SVN-Notify-2.66/lib/SVN/Notify.pm chris On Jun 30, 2007, at 8:25 PM, Jason Stajich wrote: > Thanks George - > I also did > chgrp -R bioperl /home/hartzell/bioperl_take? > to make sure the group permission was set right. > > We may also want to do a chmod g+s on all the dirs in there as well > so that permissions are preserved when this gets deployed for real. > > If anyone wants to make some changes to files and commit them, as > well as make some branches/tags to play around a little bit since > we'll likely throw this away and do it again from locked down version > from CVS at some appointed time. > > Do you know how to have svn commit messages generate summary emails > as well? > > -j > On Jun 30, 2007, at 5:48 PM, George Hartzell wrote: > >> >> There's a second cut at the subversion repository. I've done a >> better >> job of setting svn:keywords and svn:eol-style on various files. 
The >> defaults were more cautious and I used an auto-props file based on >> the wiki version. >> >> svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take2 >> >> The old repository's still around as >> >> svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take1 >> >> I renamed it so that people wouldn't work with it by mistake. If, for >> some hard-to-imagine reason, you have a working copy that you want to >> run against it, you should be able to do an svn switch --relocate on >> your working copy and be back in shape. In fact, it might be a good >> time to give it a try.... >> >> g. > > -- > Jason Stajich > jason at bioperl.org > http://jason.open-bio.org/ Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Sat Jun 30 22:46:05 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 30 Jun 2007 21:46:05 -0500 Subject: [Bioperl-l] Splits again In-Reply-To: <4686CC04.6000403@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> <4683DBEA.90005@sendu.me.uk> <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu> <468409C7.7020102@sendu.me.uk> <4686CC04.6000403@sendu.me.uk> Message-ID: On Jun 30, 2007, at 4:32 PM, Sendu Bala wrote: > Hilmar Lapp wrote: >> On Jun 28, 2007, at 3:19 PM, Sendu Bala wrote: >>> [...] >>> Very definitely the latter.
The key benefit of my approach is >>> that the organisation stays as is and that a snapshot of the >>> repository remains a single directory of modules in Bio so that >>> people don't have to 'install' Bioperl, they can still just >>> uncompress the archive (or check out the package from svn) and >>> point their PERL5LIB to the root dir of the package. > [snip] >> In this sense, I understand a release pumpkin will generate ~900 >> packages to upload to CPAN? How much hassle is that compared to >> what uploading a bioperl release means right now? > > I'd have to investigate. I did my uploads using the PAUSE website, > which for 900 packages would be unfeasible. Will have to see if the > process can be automated. Not that they would care one way or another but maybe we should contact the CPAN maintainers to get their thoughts. They might have some ideas... >> How brittle is all the Build.PL code that will be needed to >> automate all of this, and how difficult will it be to maintain? >> For example, if someone adds in 10 new modules, what Build.PL- >> related work will need to be done? > > Well, my plan will be that once the work is done, you won't need to > touch the Build.PL code again. My intent is that the pumpkin can > just type one command and not think about anything. > > As for the reality, I won't know until I think about it properly > and experiment. A good experiment for a branch. I still think this could be accomplished step-wise; for instance run a quick test using something with a simple dependency tree like Bio::Root::Root (only needs RootI), finish up with Bio::Root*, then work down into PrimarySeq, Seq, etc. Submit them to CPAN piecemeal or in batches (all Bio::Seq*, so on). If the Build.PL, etc are to be generated on the fly then maybe there should be a simple way of registering or matching tests to modules (or vice versa) to ease the pain, particularly for new code. 
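One way the matching could work, purely as a sketch: derive the expected test file from the module name by convention and report anything that has no test. The t/<Name>.t layout and the helper names test_for and untested are assumptions made up for illustration; this is not how the tests are laid out today.

```perl
use strict;
use warnings;

# Hypothetical convention: Bio::Root::Root -> t/Root/Root.t
sub test_for {
    my ($module) = @_;
    (my $path = $module) =~ s/^Bio:://;   # drop the common namespace prefix
    $path =~ s{::}{/}g;                   # remaining separators become directories
    return "t/$path.t";
}

# Return the modules that have no matching test file in the given list.
sub untested {
    my ($modules, $tests) = @_;
    my %have = map { $_ => 1 } @$tests;
    return grep { !$have{ test_for($_) } } @$modules;
}

my @modules = qw(Bio::Root::Root Bio::Root::RootI Bio::PrimarySeq);
my @tests   = qw(t/Root/Root.t t/PrimarySeq.t);

my @missing = untested(\@modules, \@tests);
print "no test registered for: @missing\n";
```

A generated Build.PL could run a check like this and refuse to package a module whose test is missing, which would catch new code early.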
chris From hlapp at gmx.net Sat Jun 30 22:56:04 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Sat, 30 Jun 2007 22:56:04 -0400 Subject: [Bioperl-l] First cut svn repository In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> <4683C385.3050904@sendu.me.uk> <18051.63674.685297.426813@almost.alerce.com> <18052.3946.224905.415905@almost.alerce.com> <2159ED58-E6F4-4ED8-AC23-E8BAF69FE240@gmx.net> Message-ID: It turns out that both files are also present on the release-0-9-3, bioperl-1-0-0, bioperl-1-0-alpha, and bioperl-1-0-alpha2-rc tags, so add $ svn rm -m "Removing offending duplicate" svn+ssh://dev.open-bio.org/ home/hartzell/bioperl/bioperl-live/tags/release-0-9-3/t/data/ HUMBETGLOA.fasta $ svn rm -m "Removing offending duplicate" svn+ssh://dev.open-bio.org/ home/hartzell/bioperl/bioperl-live/tags/bioperl-1-0-0/t/data/ HUMBETGLOA.fasta $ svn rm -m "Removing offending duplicate" svn+ssh://dev.open-bio.org/ home/hartzell/bioperl/bioperl-live/tags/bioperl-1-0-alpha/t/data/ HUMBETGLOA.fasta $ svn rm -m "Removing offending duplicate" svn+ssh://dev.open-bio.org/ home/hartzell/bioperl/bioperl-live/tags/bioperl-1-0-alpha2-rc/t/data/ HUMBETGLOA.fasta to the post-processing commands. -hilmar On Jun 30, 2007, at 8:40 PM, Chris Fields wrote: > Checkout worked for me (Mac OS X) using both: > > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- > live/tags/release-0-9-2/t/data > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- > live/tags/release-0-9-2/ > > so removing the offending file worked (good catch!). Haven't run a > full co but probably isn't necessary. 
> > chris > > On Jun 30, 2007, at 6:36 PM, Hilmar Lapp wrote: > >> >> On Jun 28, 2007, at 3:43 PM, George Hartzell wrote: >> >>> I just did the experiment, and filename-insensitivity seems to be >>> breaking something. >>> >>> I'm using an svn I picked up from http://www.codingmonkeys.de/mbo/. >>> >>> I reformatted a memory stick to be case sensitive and co of >>> >>> bioperl/bioperl-live/tags/release-0-9-2/t >>> >>> worked, then I made a directory in my home dir (normal mac thing) >>> and >>> got the same error as above. >> >> You picked up a rename of a file from lower case extension to >> upper case extension. Unfortunately, there are several months >> between adding the upper-case and removing the lower-case version. >> >> We can reconstruct what happened with this using svn log on the >> directory (this does not require a checkout): >> >> $ svn log --verbose svn+ssh://dev.open-bio.org/home/hartzell/ >> bioperl/bioperl-live/trunk/t/data >> >> Searching for HUMBETGLOA yields the following two commits that >> added one and removed the other: >> >> --------------------------------------------------------------------- >> --- >> r2245 | jason | 2001-12-08 11:59:05 -0500 (Sat, 08 Dec 2001) | 2 >> lines >> Changed paths: >> M /bioperl-live/trunk/t/SearchIO.t >> A /bioperl-live/trunk/t/data/HUMBETGLOA.FASTA >> A /bioperl-live/trunk/t/data/cysprot1.FASTA >> >> added tests for FASTA >> >> --------------------------------------------------------------------- >> --- >> r2877 | jason | 2002-03-11 22:39:40 -0500 (Mon, 11 Mar 2002) | 2 >> lines >> Changed paths: >> A /bioperl-live/trunk/t/data/HUMBETGLOA.fa >> D /bioperl-live/trunk/t/data/HUMBETGLOA.fasta >> >> renaming file to avoid clobbering on windows >> >> Unfortunately, both files are in the tag (again, no checkout >> required): >> >> $ svn list svn+ssh://dev.open-bio.org/home/hartzell/bioperl/ >> bioperl-live/tags/release-0-9-2/t/data | grep HUMBETGLOA | grep -i >> fasta >> HUMBETGLOA.FASTA >> HUMBETGLOA.fasta >> >> We 
can remove the offending version from the repository (again, >> without needing a checkout): >> >> $ svn rm svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- >> live/tags/release-0-9-2/t/data/HUMBETGLOA.fasta >> >> I did this, and now the tag checks out fine on OSX. Can anyone >> confirm? >> >> (BTW the ability to operate on the repository w/o needing a >> checkout is another advantage of svn) >> >> -hilmar >> -- >> =========================================================== >> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >> =========================================================== >> >> >> >> >> > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From bix at sendu.me.uk Fri Jun 1 08:06:04 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 01 Jun 2007 09:06:04 +0100 Subject: [Bioperl-l] ClustalW Score? In-Reply-To: <1A4207F8295607498283FE9E93B775B40334A01A@EX02.asurite.ad.asu.edu> References: <00e201c7a2de$91f60f50$2d01a8c0@PICO><465E9B58.1020403@sendu.me.uk> <49B6333A-18B9-4B63-80EF-81C57A295494@bioperl.org> <1A4207F8295607498283FE9E93B775B40334A01A@EX02.asurite.ad.asu.edu> Message-ID: <465FD36C.5060603@sendu.me.uk> Kevin Brown wrote: >> you're right --- it is not really my code, I was just >> elaborating Kevin's example --- it would probably need to be >> more specific or perhaps the last Score seen is sufficient >> for what one is trying to capture? > > I took that code from a pairwise clustal alignment script that I wrote > to deal with aligning a bunch of short sequences against a long one to > see where they line up at. When all of them were fed to Clustal the > short sequences all ended up aligned to each other and not well aligned > to the longer sequence. 
I only saw one score in the output from the > pairwise, so that is what I used to find a reasonable value. Ok, well I've hedged my bets and used both. Now committed to CVS. From jy at genseq.co.uk Sat Jun 2 02:39:48 2007 From: jy at genseq.co.uk (Jean-Yves Sireau) Date: Sat, 2 Jun 2007 10:39:48 +0800 Subject: [Bioperl-l] Genseq Message-ID: <20070602103948.093d713c@jys.my.regentmarkets.com> Dear List members, I would like to let you know of the formation of Genseq Ltd., a bioinformatics company that will (in time!) offer genome sequencing to high net worth individuals and bioinformatic analysis of the sequence data to detect predisposition to illness. The company's website is www.genseq.co.uk Genseq would be willing to sponsor bioperl, whether financially or by providing resources, notably for any bioperl-related activities in the Asia Pacific region. Genseq's bioinformatics team will be based in Cyberjaya (Malaysia), and we are particularly interested in promoting bioperl in Malaysia. We are also actively recruiting at the moment in Malaysia and India. If there were sufficient demand, we would be willing to organise a bioperl conference in Cyberjaya at the Cyberview Lodge (www.cyberview-lodge.com), which would be the ideal place for such a conference in Malaysia. Looking forward to your comments, suggestions and proposals. Best regards Jean-Yves Sireau -- Jean-Yves Sireau CEO, Genseq Ltd. www.genseq.co.uk From cjfields at uiuc.edu Sat Jun 2 05:16:05 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 2 Jun 2007 00:16:05 -0500 Subject: [Bioperl-l] EUtilities overhaul started Message-ID: To anyone using Bio::DB::EUtilities, I am in the midst of a major overhaul to the various EUtilities tools and to Bio::DB::GenericWebDBI (the latter of which I am turning into more or less a test bed for other database interfaces). I'm about 80% done at this point, and will likely start committing changes this coming week.
The overall interface will change (something I had warned about in the Bio::DB::EUtilities POD) but I am hoping it will be more intuitive and easier to use in the long run. I'll describe the overall redesign and use in an upcoming HOWTO (as recommended by Brian a while back). If anyone has any suggestions/ideas/flames, please let me know! Cheers! chris From cjfields at uiuc.edu Sat Jun 2 14:39:25 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 2 Jun 2007 09:39:25 -0500 Subject: [Bioperl-l] EUtilities overhaul started In-Reply-To: References: Message-ID: Yes, there are a few odd issues, though that's one I've not heard of yet. You might try one of the sub-nucleotide databases (nuccore, nucest, nucgss). I'll try looking into it and (if necessary) pester NCBI about it. I'll pass this on to the mail list to see if anyone else knows about the problem. chris On Jun 2, 2007, at 8:28 AM, Bernd Brandt wrote: > Hi Chris, > > Thanks for your work on EUtilities. > For a production task, I used EUtilitities directly (given your > announced overhaul). I noticed a recent problem at NCBI (reported two > weeks ago to NCBI, no reply yet). Possibly you may run into this with > testing: if you ePOST gi ids to the EU server and then use this set in > Esearch (using the query key) no results are returned for the > nucleotide database. > ESearches like "db=$db%23$QueryKey" typically fail if the $db is > nucleotide (but work f $db='protein'). The XML output has Count 0 and > an empty QueryTranslationSet for db=nucleotide only. > For completeness, I attach a simple test script I used. > > > Best regards, > Bernd > > > On 6/2/07, Chris Fields wrote: >> To anyone using Bio::DB::EUilities, >> >> I am in the midst of a major overhaul to the various EUtilities tools >> and to Bio::DB::GenericWebDBI (the latter which I am forming into >> more or less a test bed for other database interfaces). 
I'm about >> 80% done at this point, and will likely start committing changes this >> coming week. >> >> The overall interface will change (something I had warned about in >> the Bio::DB::EUtilities POD) but I am hoping it will be more >> intuitive and easier to use in the long run. I'll describe the >> overall redesign and use in an upcoming HOWTO (as recommended by >> Brian a while back). >> >> If anyone has any suggestions/ideas/flames, please let me know! >> >> Cheers! >> >> chris >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Sun Jun 3 04:51:57 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 2 Jun 2007 23:51:57 -0500 Subject: [Bioperl-l] EUtilities overhaul started In-Reply-To: References: Message-ID: <1A2AF5C4-6A58-4FDD-A4CA-6ABCE30F0D1B@uiuc.edu> I can confirm this; however it only relates to the use of history with esearch and nucleotide (use of the history with other eutils seems to work fine); retrieving sequences via efetch is not affected. If I find out anything more I'll post something on the mail list. chris On Jun 2, 2007, at 11:48 AM, Bernd Brandt wrote: > I can confirm that using the correct sub-nucleotide database works > (nuccore in my case). > This seems to be a quite recent change/bug at NCBI. Until recently, > db=nucleotide worked. Moreover, EInfo still lists nucleotide as valid > db. > It is not optimal to have to choose the sub-database and the searches > work via the Entrez web-interface. Note that this problem is related > to the ESearch and db=nucleotide. > > bernd > > On 6/2/07, Chris Fields wrote: >> Yes, there are a few odd issues, though that's one I've not heard of >> yet. 
You might try one of the sub-nucleotide databases (nuccore, >> nucest, nucgss). >> >> I'll try looking into it and (if necessary) pester NCBI about it. >> I'll pass this on to the mail list to see if anyone else knows about >> the problem. >> >> chris >> >> On Jun 2, 2007, at 8:28 AM, Bernd Brandt wrote: >> >> > Hi Chris, >> > >> > Thanks for your work on EUtilities. >> > For a production task, I used EUtilitities directly (given your >> > announced overhaul). I noticed a recent problem at NCBI >> (reported two >> > weeks ago to NCBI, no reply yet). Possibly you may run into this >> with >> > testing: if you ePOST gi ids to the EU server and then use this >> set in >> > Esearch (using the query key) no results are returned for the >> > nucleotide database. >> > ESearches like "db=$db%23$QueryKey" typically fail if the $db is >> > nucleotide (but work f $db='protein'). The XML output has Count >> 0 and >> > an empty QueryTranslationSet for db=nucleotide only. >> > For completeness, I attach a simple test script I used. >> > >> > >> > Best regards, >> > Bernd >> > >> > >> > On 6/2/07, Chris Fields wrote: >> >> To anyone using Bio::DB::EUilities, >> >> >> >> I am in the midst of a major overhaul to the various EUtilities >> tools >> >> and to Bio::DB::GenericWebDBI (the latter which I am forming into >> >> more or less a test bed for other database interfaces). I'm about >> >> 80% done at this point, and will likely start committing >> changes this >> >> coming week. >> >> >> >> The overall interface will change (something I had warned about in >> >> the Bio::DB::EUtilities POD) but I am hoping it will be more >> >> intuitive and easier to use in the long run. I'll describe the >> >> overall redesign and use in an upcoming HOWTO (as recommended by >> >> Brian a while back). >> >> >> >> If anyone has any suggestions/ideas/flames, please let me know! >> >> >> >> Cheers! 
>> >> >> >> chris >> >> _______________________________________________ >> >> Bioperl-l mailing list >> >> Bioperl-l at lists.open-bio.org >> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> >> >> >> >> Christopher Fields >> Postdoctoral Researcher >> Lab of Dr. Robert Switzer >> Dept of Biochemistry >> University of Illinois Urbana-Champaign >> >> >> >> Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From basu at pharm.stonybrook.edu Sun Jun 3 14:44:18 2007 From: basu at pharm.stonybrook.edu (Siddhartha Basu) Date: Sun, 03 Jun 2007 10:44:18 -0400 Subject: [Bioperl-l] EUtilities overhaul started In-Reply-To: References: Message-ID: On Sat, 2 Jun 2007 00:16:05 -0500 Chris Fields wrote: > To anyone using Bio::DB::EUilities, > > I am in the midst of a major overhaul to the various >EUtilities tools > and to Bio::DB::GenericWebDBI (the latter which I am >forming into > more or less a test bed for other database interfaces). > I'm about > 80% done at this point, and will likely start committing >changes this > coming week. > > The overall interface will change (something I had >warned about in > the Bio::DB::EUtilities POD) but I am hoping it will be >more > intuitive and easier to use in the long run. I'll >describe the > overall redesign and use in an upcoming HOWTO (as >recommended by > Brian a while back). Hi chris, Being a frequent user of EUtilities, hopefully this api facelift and upcoming howto will definitely be more helpful. Anyway, one thing i noticed that for each eutil call such as efetch,epost,esearch,esummary a new 'Bio::DB::Utilities' object has to be instantiated. And thereafter it cannot be set during runtime such as $eutils->id('ids'), for example.... my $eutils = Bio::DB::Eutilities->new ( -id => $id, -eutil => 'esummary', -db => 'protein', ); my $ct = $eutils->get_response->content(); ## -- now i cannot do this... 
$eutils->id($newid); my $ct = $eutils->get_response->content(); Is the new api going to address something along this line or is there currently anyway to reuse the object. Thanks again for this nice toolkit. -siddhartha > > If anyone has any suggestions/ideas/flames, please let >me know! > > Cheers! > > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at uiuc.edu Sun Jun 3 23:52:39 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 3 Jun 2007 18:52:39 -0500 Subject: [Bioperl-l] EUtilities overhaul started In-Reply-To: References: Message-ID: <5120BD7B-CA89-46E4-8D6B-6B24C1F93A5E@uiuc.edu> On Jun 3, 2007, at 9:44 AM, Siddhartha Basu wrote: > ... > Hi chris, > Being a frequent user of EUtilities, hopefully this api facelift > and upcoming howto will definitely be more helpful. > Anyway, one thing i noticed that for each eutil call such as > efetch,epost,esearch,esummary a new 'Bio::DB::Utilities' object has > to be > instantiated. And thereafter it cannot be set during runtime such as > $eutils->id('ids'), for example.... > > my $eutils = Bio::DB::Eutilities->new ( -id => $id, > -eutil => 'esummary', > -db => 'protein', > ); > my $ct = $eutils->get_response->content(); > > ## -- now i cannot do this... > $eutils->id($newid); > my $ct = $eutils->get_response->content(); I'll have to check up on that, though changing id() should work with the old API. It won't matter with the new API (it works fine), but it is still troubling... > Is the new api going to address something along this line or is > there currently anyway to reuse > the object. > Thanks again for this nice toolkit. > > -siddhartha The old API was based upon the idea of creating discrete user agents for each eutil to retrieve data. 
The problem with the old interface is that it attempts to do too much (take care of parameters, set up requests, retrieve responses, parse data, etc), and many tasks required instantiating a new EUtilities object. I was never really satisfied with it.

The new interface is a composition of three classes: the web user agent (LWP::UserAgent), a class encapsulating parameter handling, and a parser class (all of which can be used independently if needed). When parameters change, a new request is made 'lazily' (i.e. only when needed). Similarly, when data is requested after any parameter change, a new parser instance is created and the new response is parsed.

With that in mind you can now do the following:

----------------------------------------
my @params = (-eutil  => 'esearch',
              -db     => 'protein',
              -term   => 'BRCA1',
              -retmax => 100);

my $eutil = Bio::DB::EUtilities->new(@params);

# no need to get response first; get_ids() calls that if needed
my @ids = $eutil->get_ids;

# below changes only those parameters, leaves all others set as before
$eutil->set_parameters(-eutil   => 'efetch',
                       -id      => \@ids,
                       -retmode => 'text',
                       -rettype => 'fasta');

# sends streamed content directly to a file
$eutil->get_response(-content_file => 'seqs.fas');

# or to a LWP::UserAgent-supported request callback
$eutil->get_response(-content_cb => \&my_cb);

my @newparams = (-eutil  => 'esearch',
                 -db     => 'protein',
                 -term   => 'BRCA2',
                 -retmax => 100);

# Resets eutility to passed parameters (or undef)
$eutil->reset_parameters(@newparams);

# retrieve new IDs
my @new_ids = $eutil->get_ids;
----------------------------------------

Note the same eutil object is used for all of the above, so to answer your last question, yes, you should be able to create data pipelines using the same object if necessary.
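For anyone curious what 'lazily' means in practice, the idea can be sketched in plain Perl roughly as follows. This is not the Bio::DB::EUtilities code: the package name LazyClient, its methods and the fetch callback (which stands in for the web request) are all invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical illustration only; not the Bio::DB::EUtilities implementation.
package LazyClient;

sub new {
    my ($class, %args) = @_;
    my $self = {
        fetch  => $args{fetch},   # code ref standing in for the web request
        params => {},
        cache  => undef,          # last response; undef means stale
    };
    return bless $self, $class;
}

# Merge in new parameters and mark any cached response as stale.
sub set_parameters {
    my ($self, %new) = @_;
    @{ $self->{params} }{ keys %new } = values %new;
    $self->{cache} = undef;
    return $self;
}

# Only call the fetch callback when there is no fresh response.
sub get_response {
    my ($self) = @_;
    $self->{cache} = $self->{fetch}->( { %{ $self->{params} } } )
        unless defined $self->{cache};
    return $self->{cache};
}

package main;

my $requests = 0;
my $client = LazyClient->new(
    fetch => sub { my ($p) = @_; $requests++; return "term=$p->{term}" },
);

$client->set_parameters(term => 'BRCA1');
my $first  = $client->get_response;   # triggers a request
my $again  = $client->get_response;   # reuses the cached response
$client->set_parameters(term => 'BRCA2');
my $second = $client->get_response;   # parameters changed, so a new request

print "$first / $second after $requests requests\n";
```

Three calls to get_response cost only two requests here, because the second call finds nothing has changed since the first.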
chris From sac at bioperl.org Mon Jun 4 17:56:57 2007 From: sac at bioperl.org (Steve Chervitz) Date: Mon, 4 Jun 2007 10:56:57 -0700 Subject: [Bioperl-l] question about Bio::Restriction::Analysis In-Reply-To: <3E4CBE0B-6EE4-4973-80DF-90C7E778DA83@cshl.edu> References: <3E4CBE0B-6EE4-4973-80DF-90C7E778DA83@cshl.edu> Message-ID: <8f200b4c0706041056o4dbaadfexddf9f82fc33c6da@mail.gmail.com> Hi Apurva, I'm cc:ing the list to let others know you have found performance issues with Bio::Restriction::Analysis. Ideally, we should focus on addressing those issues rather than fixing a module that is now deprecated. But taking a quick look at my Bio::Tools::RestrictionEnzyme module, I'm not sure why HpaII would give slower performance relative to other non-ambiguous cutters. This enzyme has a 4-base recognition sequence CCGG, and if you're feeding it a large CG-rich input sequence, that could be a factor. To test, you might try using some other 4-base cutters that aren't CG-rich (TaqI, TasI) or try some other input sequences. There is no special flag to indicate that the enzyme is non-ambiguous. The module handles that automatically. Good luck, Steve On 6/4/07, Apurva Narechania wrote: > Hi Rob and Steve, > > I was hoping you could answer a quick performance question regarding > the Bio::Restriction::Analysis module. I have found that though this > module works well, it is considerably slower than the deprecated > Bio::Tools::RestrictionEnzyme. I see that there are two algorithms > available to your module, and since I am using HpaII, a non-ambiguous > enzyme, I thought I might find similar performance to the older, > deprecated module, but I do not. Is it possible that I am not setting > the non-ambiguous flag correctly? Does it need to be set in the first > place? 
> > As far as Bio::Tools::RestrictionEnzyme, though it is faster, I have > found instances where it is inaccurate, especially in calculating > fragments of extremely small size 1-5 base pairs, so I would like to > use your module if possible. It just seems slow to me. > > Can you clarify? > > I have copied my code below since it is a short, simple script. > > Thanks! > Apurva Narechania > Ware Lab > Cold Spring Harbor Labs > > ---------- > > #!/usr/bin/perl > > # This program generates a fasta of restriction frags given an > # input fasta and a restriction cut site > > use Getopt::Std; > use Bio::Seq; > use Bio::SeqIO; > use strict; > > use Bio::Tools::RestrictionEnzyme; > > my %opts = (); > getopts ('f:', \%opts); > my $fasta = $opts{'f'}; > > # read fasta file > my $seqin = Bio::SeqIO -> new (-format => 'Fasta', -file => "$fasta"); > > my $x = 0; > while (my $sequence_obj = $seqin -> next_seq()){ > $x++; > my $id = $sequence_obj->id(); > > print STDERR "$x Working on $id\n"; > > # generate the rx object > my $ra = new Bio::Tools::RestrictionEnzyme(-NAME=>'HpaII'); > > my @frags = $ra->cut_seq($sequence_obj); > > my $counter = 0; > foreach my $frag (@frags){ > $counter++; > my $length = length ($frag); > print ">$id.$counter length=$length\n$frag\n"; > } > > } > > From anhthu.tieu at gsf.de Tue Jun 5 08:14:09 2007 From: anhthu.tieu at gsf.de (Tieu, Anh-Thu) Date: Tue, 5 Jun 2007 10:14:09 +0200 Subject: [Bioperl-l] problems with image maps and IE 6 or higher Message-ID: <93739F94E0F3BA43AD72423E2482341A1435F6@sw-rz010.gsf.de> Hi, I have a problem using the bioperl image maps function with the IE6 or and higher browser. It might be a more general problem with IE6 rather than with bioperl, but as I used bioperl to create my image maps, I thought I could still post this problem here and ask for people's opinion. I wondered if anyone else faced the same problem and if possible if anyone could share their experiences and their solutions.

scale alignment5 integration_pt gene intron1 usemap="mapnameD064C01" style="border:2px solid #CCCCCC;"/>

> > onclick="javascript:void(zmenu( 'scale' ));;return false;" title="scale " > alt="scale " target="_blank"/> > onclick="javascript:void(zmenu( 'alignment 5splk', '', 'seq_id: ', '', > 'start: ', '', 'stop: 0', '', 'length: bp', '', 'identity: ', '', 'e-v > alue: ' ));;return false;" title="alignment5 " alt="alignment5 " > target="_blank"/> > onclick="javascript:void(zmenu( 'alignment 5splk', '', 'seq_id: ', '', > 'start: ', '', 'stop: 0', '', 'length: bp', '', 'identity: ', '', 'e-v > alue: ' ));;return false;" title="integration_pt " alt="integration_pt " > target="_blank"/> > onclick="javascript:void(zmenu( 'Nphs1 ', > '', 'ensembl_id: ENSMUSG00000006649', '', 'start: 30168485', '', ' > stop: 30195968', '', 'length: 27483 bp' ));;return false;" title="gene " > alt="gene " target="_blank"/> > onclick="javascript:void(zmenu( 'exon1', '', 'start: 30168485', '', 'stop: > 30169003', '', 'length: 518 bp' ));;return false;" title="exon1 " a > lt="exon1 " target="_blank"/> > onclick="javascript:void(zmenu( 'intron1', '', 'start: 30169004', '', 'stop: > 30169083', '', 'length: 79 bp ' ));;return false;" title="intron1 > " alt="intron1 " target="_blank"/> > onclick="javascript:void(zmenu( 'exon2', '', 'start: 30169084', '', 'stop: > 30169299', '', 'length: 215 bp' ));;return false;" title="exon2 " a > lt="exon2 " target="_blank"/> > onclick="javascript:void(zmenu( 'intron2', '', 'start: 30169300', '', 'stop: > 30169373', '', 'length: 73 bp ' ));;return false;" title="intron2 > .. >
> > > This is part of the code I used in my HTML file to display the image map > and it really runs beautifully > with Mozilla 1.7 or the latest Firefox version. However, if used in IE6 > the clickable pop-ups do not appear/work. > > I appreciate any help and would like to thank everyone for their help. > > Best regards, > > > Anh-Thu > ________________________________________________________________________ > GSF-Forschungszentrum > > Ingolstädter Landstr. 1 > > 85764 München-Neuherberg, Germany > > Chairman of Supervisory Board: MinDir Dr. Peter Lange > > Board of Directors: Prof. Dr. Günther Wess and Dr. Nikolaus Blum > > Register of Societies: Amtsgericht München HRB 6466 > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From cjfields at uiuc.edu Tue Jun 5 15:28:24 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 5 Jun 2007 10:28:24 -0500 Subject: [Bioperl-l] [BioPython] Cannot parse GenBank file In-Reply-To: <46656D64.7010508@ribosome.natur.cuni.cz> References: <46655550.70400@ribosome.natur.cuni.cz> <46656D64.7010508@ribosome.natur.cuni.cz> Message-ID: <24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu> Martin, The example file you give in the bioperl bugzilla report has several blank annotation lines which may lead to additional problems. When the BioPerl SeqIO parser finds annotation fields (SOURCE, ORGANISM, DEFINITION, etc) then it expects there will also be relevant data (text descriptions) accompanying it; I assume the BioPython parser expects likewise though I may be wrong. AFAIK the inclusion of field names w/o text isn't GenBank/EMBL-compliant.
Fields lacking text in GenBank records either carry a '.' instead or are left out entirely: http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html We could add a fix but you should probably contact the ApE developers and request that field names w/o text be left out or have '.' added. chris On Jun 5, 2007, at 9:04 AM, Martin MOKREJŠ wrote: > Ezequiel Panepucci wrote: >>> genbank entry = parser.parse(fhandle) >> >> there is a space character between "genbank" and "entry". >> It is a syntax error. >> I suppose you meant "genbank_entry" ? > > Yes, the next command was right and has shown the error. Sorry, I > forgot > to delete the first attempt. ;-) > >>>> genbank_entry = parser.parse(fhandle) > Traceback (most recent call last): > File "<stdin>", line 1, in ? > File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py", > line 187, in parse > self._scanner.feed(handle, self._consumer) > File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", > line 360, in feed > self._feed_first_line(consumer, self.line) > File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", > line 835, in _feed_first_line > assert False, \ > AssertionError: Did not recognise the LOCUS line layout: > LOCUS 6499 bp ds-DNA linear 02-AUG-2006 > >>>> > > Martin > _______________________________________________ > BioPython mailing list - BioPython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From stewarta at nmrc.navy.mil Tue Jun 5 15:34:14 2007 From: stewarta at nmrc.navy.mil (Andrew Stewart) Date: Tue, 5 Jun 2007 11:34:14 -0400 Subject: [Bioperl-l] Setting attributes on a Bio::DB::GFF::Feature object Message-ID: <95C9F539-A4C4-4B6A-8DA8-079B957BF909@nmrc.navy.mil> I see bidirectional mutator methods for source, type, strand, etc.
in the Bio::DB::GFF::Feature documentation, but ->attributes appears only able to get, not set, the feature attributes. Is there no way to modify the attributes of a Bio::DB::GFF::Feature live? -- Andrew Stewart Research Assistant, Genomics Team Navy Medical Research Center (NMRC) Biological Defense Research Directorate (BDRD) BDRD Annex 12300 Washington Avenue, 2nd Floor Rockville, MD 20852 email: stewarta at nmrc.navy.mil phone: 301-231-6700 Ext 270 From cjfields at uiuc.edu Tue Jun 5 16:07:41 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 5 Jun 2007 11:07:41 -0500 Subject: [Bioperl-l] [BioPython] Cannot parse GenBank file In-Reply-To: <24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu> References: <46655550.70400@ribosome.natur.cuni.cz> <46656D64.7010508@ribosome.natur.cuni.cz> <24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu> Message-ID: One thing I missed which explains the biopython error: the LOCUS line is missing the locus identifier (see the NCBI example record link). This doesn't choke the bioperl parser but it appears to stop the biopython parser in its tracks (maybe a feature instead of a bug!). You should try adding a unique identifier (maybe the name of the file or record) to the LOCUS line to see if it works: LOCUS testfile 6499 bp ds-DNA linear 02-AUG-2006 The bioperl parser in CVS writes out the correct alphabet when this is added: LOCUS testfile 6499 bp ds-DNA linear 02-AUG-2006 I'll try adding a warning to the bioperl parser for this. chris On Jun 5, 2007, at 10:28 AM, Chris Fields wrote: > Martin, > > The example file you give in the bioperl bugzilla report has several > blank annotation lines which may lead to additional problems. When > the BioPerl SeqIO parser finds annotation fields (SOURCE, ORGANISM, > DEFINITION, etc) then it expects there will also be relevant data > (text descriptions) accompanying it; I assume the BioPython parser > expects likewise though I may be wrong.
> > AFAIK the inclusion of field names w/o text isn't GenBank/EMBL- > compliant. GenBank records lacking text either have a '.' instead or > are left out entirely: > > http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html > > We could add a fix but you should probably contact the ApE developers > and request that field names w/o text be left out or have '.' added. > > chris > > On Jun 5, 2007, at 9:04 AM, Martin MOKREJ? wrote: > >> Ezequiel Panepucci wrote: >>>> genbank entry = parser.parse(fhandle) >>> >>> there is a space character between "genbank" and "entry". >>> It is a syntax error. >>> I suppose you meant "genbank_entry" ? >> >> Yes, the next command was right and has shown the error. Sorry, I >> forgot >> to delete the first attempt. ;-) >> >>>>> genbank_entry = parser.parse(fhandle) >> Traceback (most recent call last): >> File "", line 1, in ? >> File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py", >> line 187, in parse >> self._scanner.feed(handle, self._consumer) >> File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", >> line 360, in feed >> self._feed_first_line(consumer, self.line) >> File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", >> line 835, in _feed_first_line >> assert False, \ >> AssertionError: Did not recognise the LOCUS line layout: >> LOCUS 6499 bp ds-DNA linear 02-AUG-2006 >> >>>>> >> >> Martin >> _______________________________________________ >> BioPython mailing list - BioPython at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biopython > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > > _______________________________________________ > BioPython mailing list - BioPython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From staffa at niehs.nih.gov Wed Jun 6 02:00:34 2007
From: staffa at niehs.nih.gov (Staffa, Nick (NIH/NIEHS))
Date: Tue, 05 Jun 2007 22:00:34 -0400
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To:
Message-ID:

I am wondering whether, if I knew exactly what this error message meant, I could discern my error.
I don't see much difference between this program and programs that worked.
Can I assume that the call to new worked because an index file exists?
I don't know how the filehandle UTR_TT_GENES gets involved.
Maybe I should use some other module, but I really would like to have get_Seq_by_id functionality.

The error message:
Dpse ortholog = Dpse_GA17307
fetching GA17307
Can't call method "seq" on an undefined value at Match-emNEWTEST.pl line 84, <UTR_TT_GENES> line 4.

Relevant code:
#!/usr/bin/perl
#
#
#
use strict;
use Bio::DB::Fasta;
use Bio::Tools::SeqWords;
use Bio::Seq;
use Bio::SeqIO;
#
my $db = Bio::DB::Fasta->new('/home/staffa/clients/Kari/D_pse_genome/testit/TT_orthologs_Dpse_genes.fa',
                             -makeid => \&make_my_id);
...
...
...
my $pse_obj = $db->get_Seq_by_id('GA17307');
my $pse_sequence = $pse_obj->seq;

Nick Staffa
Telephone: 919-316-4569 (NIEHS: 6-4569)
Scientific Computing Support Group
NIEHS Information Technology Support Services Contract
(Science Task Monitor: John D. Grovenstein (grovens1 at niehs.nih.gov)
National Institute of Environmental Health Sciences
National Institutes of Health
Research Triangle Park, North Carolina

From jason at bioperl.org Wed Jun 6 03:12:40 2007
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 5 Jun 2007 20:12:40 -0700
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To:
References:
Message-ID:

The file handle is probably not important; Perl just reports this if there is a filehandle open.

More importantly, what is on line 84? My guess is you are trying to get a sequence out and it doesn't exist - some error-checking code around the lines getting the sequence out would be helpful.
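
A minimal sketch of that kind of check, not taken from the thread: the header layout (">Dpse_GA17307 ...") and the body of make_my_id are assumptions, and a plain hash stands in for the Bio::DB::Fasta index so the snippet runs without BioPerl. With the real module, the same guard would go around the value returned by $db->get_Seq_by_id($id) before ->seq is called on it.

```perl
use strict;
use warnings;

# Hypothetical ID callback of the kind passed as -makeid => \&make_my_id.
# Assumes headers look like ">Dpse_GA17307 description"; adjust the
# pattern to whatever the FASTA headers really contain.
sub make_my_id {
    my ($header) = @_;
    return $header =~ /^>?Dpse_(\S+)/ ? $1 : undef;
}

# Stand-in for the indexed file: ID (as produced by make_my_id) => sequence.
my %index;
my @records = ( [ '>Dpse_GA17307 made-up entry', 'ATGCATGC' ] );
for my $record (@records) {
    my ( $header, $sequence ) = @$record;
    my $id = make_my_id($header);
    next unless defined $id;    # skip headers the pattern does not match
    $index{$id} = $sequence;
}

# Look an ID up and report a miss instead of failing later with
# "Can't call method ... on an undefined value".
sub fetch_sequence {
    my ( $index, $id ) = @_;
    my $sequence = $index->{$id};
    unless ( defined $sequence ) {
        warn "no entry for '$id'; indexed IDs are: @{[ sort keys %$index ]}\n";
        return;
    }
    return $sequence;
}

my $found   = fetch_sequence( \%index, 'GA17307' );         # ID as indexed
my $missing = fetch_sequence( \%index, 'Dpse_GA17307' );    # ID as in the header
```

Listing the indexed IDs on a miss makes a mismatch between what the callback produced and what was requested visible straight away.
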
On Jun 5, 2007, at 7:00 PM, Staffa, Nick (NIH/NIEHS) wrote: > I am wondering if I knew what this error message exactly meant, if > I could > discern my error. > I don't see much difference in this program and programs that worked. > Can I assume that the new worked because an index file exists? > I don't know how the filehandle UTR_TT_GENES gets involved. > Maybe I should use some other module, but I really would like to have > get_Seq_by_id functionality. > > The error message: > Dpse ortholog = Dpse_GA17307 > fetching GA17307 > Can't call method "seq" on an undefined value at Match-emNEWTEST.pl > line 84, > line 4. > > Relevant code: > #!/usr/bin/perl > # > # > # > use strict; > use Bio::DB::Fasta; > use Bio::Tools::SeqWords; > use Bio::Seq; > use Bio::SeqIO; > # > my $db = > Bio::DB::Fasta->new('/home/staffa/clients/Kari/D_pse_genome/testit/ > TT_orthol > ogs_Dpse_genes.fa', > -makeid => \&make_my_id); > ... > ... > ... > my $pse_obj = $db->get_Seq_by_id('GA17307'); > my $pse_sequence = $pse_obj->seq; > > > > > Nick Staffa > Telephone: 919-316-4569 (NIEHS: 6-4569) > Scientific Computing Support Group > NIEHS Information Technology Support Services Contract > (Science Task Monitor: John D. Grovenstein (grovens1 at niehs.nih.gov) > National Institute of Environmental Health Sciences > National Institutes of Health > Research Triangle Park, North Carolina > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: smime.p7s Type: application/pkcs7-signature Size: 2613 bytes Desc: not available URL: From torsten.seemann at infotech.monash.edu.au Wed Jun 6 06:06:37 2007 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Wed, 6 Jun 2007 16:06:37 +1000 Subject: [Bioperl-l] Bio::DB::Fasta In-Reply-To: References: Message-ID: Nick, > Can't call method "seq" on an undefined value at Match-emNEWTEST.pl line 84, The error makes it pretty clear. You are calling the ->seq method on an undefined value, ie. $pse_obj. > my $pse_obj = $db->get_Seq_by_id('GA17307'); # check we got something! die "sequence not in database" unless $pse_obj; > my $pse_sequence = $pse_obj->seq; -- --Torsten Seemann --Victorian Bioinformatics Consortium, Monash University --Tel +61 3 9905 9010 From shameer at ncbs.res.in Wed Jun 6 06:27:42 2007 From: shameer at ncbs.res.in (Shameer Khadar) Date: Wed, 6 Jun 2007 11:57:42 +0530 (IST) Subject: [Bioperl-l] Validation of files using BioPerl Message-ID: <34441.192.168.1.1.1181111262.squirrel@mail.ncbs.res.in> Dear All, How to validate an input file in fasta/PIR/GenPept/PDB format using Bioperl ? (This is to avoid unnecessary files to be submitted to servers by new users). Any module available ? Many thanks in advance, -- Shameer Khadar From cjfields at uiuc.edu Wed Jun 6 12:37:28 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 6 Jun 2007 07:37:28 -0500 Subject: [Bioperl-l] Validation of files using BioPerl In-Reply-To: <34441.192.168.1.1.1181111262.squirrel@mail.ncbs.res.in> References: <34441.192.168.1.1.1181111262.squirrel@mail.ncbs.res.in> Message-ID: <39F5F622-0C93-4DC5-B969-491F789FC932@uiuc.edu> It has been discussed but never coded. I believe if it passes through the Bio::SeqIO parser it's generally considered validly formatted (spacing, balanced quotes), though it doesn't specifically check FT keys and qualifiers for invalid ones, look for missing annotation, check taxonomy, etc. 
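
For flat-file formats whose records end in '//', one way to apply that pass/fail test record by record can be sketched in plain Perl. This is an illustration under stated assumptions, not BioPerl code: check_record is a hypothetical stand-in for handing the chunk to a real parser (with BioPerl one would presumably build a Bio::SeqIO object on the chunk and call next_seq inside the eval), the two sample records are made up, and an in-memory filehandle is used in place of a real file.

```perl
use strict;
use warnings;

# Made-up two-record input; the second LOCUS line lacks an identifier.
my $text = <<'END';
LOCUS       testfile 6499 bp ds-DNA linear 02-AUG-2006
ORIGIN
        1 gatc
//
LOCUS       6499 bp ds-DNA linear 02-AUG-2006
ORIGIN
        1 gatc
//
END

# Hypothetical stand-in for a real parser: dies on a record it rejects.
sub check_record {
    my ($record) = @_;
    my ($locus) = $record =~ /^LOCUS\s+(.*)$/m
        or die "no LOCUS line\n";
    my @fields = split ' ', $locus;
    # A LOCUS line that starts "<number> bp" has no name in front of the length.
    die "LOCUS line has no identifier\n"
        if @fields >= 2 && $fields[0] =~ /^\d+$/ && $fields[1] eq 'bp';
    return 1;
}

# Read one record per iteration and collect the error of each failing record.
sub validate_records {
    my ($fh) = @_;
    my @errors;
    local $/ = "//\n";    # record terminator instead of newline
    my $n = 0;
    while ( my $record = <$fh> ) {
        next unless $record =~ /\S/;    # ignore trailing whitespace
        $n++;
        my $ok = eval { check_record($record); 1 };
        push @errors, "record $n: $@" unless $ok;
    }
    return @errors;
}

open my $fh, '<', \$text or die "cannot open in-memory handle: $!";
my @errors = validate_records($fh);
close $fh;

print for @errors;
print "all records passed\n" unless @errors;
```

Because each record is checked inside its own eval, one malformed record is reported without stopping the check of the rest.
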
As long as the end sequence mark (//) is present for every file, you could try parsing the file into chunks (read with 'local $/ = '//';') and tossing the seq chunks as a filehandle (via IO::String) to a Bio::SeqIO object wrapped in an eval block (the parser resets $/, so it should work). Follow the eval with a check of $@ for caught errors. It might get tedious for big sequences...

chris

On Jun 6, 2007, at 1:27 AM, Shameer Khadar wrote:

> Dear All,
>
> How to validate an input file in fasta/PIR/GenPept/PDB format using
> Bioperl ? (This is to avoid unnecessary files to be submitted to
> servers
> by new users). Any module available ?
>
> Many thanks in advance,
> --
> Shameer Khadar
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From staffa at niehs.nih.gov Wed Jun 6 14:40:49 2007
From: staffa at niehs.nih.gov (Staffa, Nick (NIH/NIEHS))
Date: Wed, 06 Jun 2007 10:40:49 -0400
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To:
Message-ID:

Indeed.
One must know what is actually in his header,
AND one must write the appropriate make_id subroutine
AND one must specify the exact ID.
THEN things might work.
And they did!
THANK YOU

On 6/6/07 2:06 AM, "Torsten Seemann" wrote:

> Nick,
>
>> Can't call method "seq" on an undefined value at Match-emNEWTEST.pl line 84,
>
> The error makes it pretty clear. You are calling the ->seq method on
> an undefined value, ie. $pse_obj.
>
>> my $pse_obj = $db->get_Seq_by_id('GA17307');
>
> # check we got something!
> die "sequence not in database" unless $pse_obj;
>
>> my $pse_sequence = $pse_obj->seq;
>

From jaudall at gmail.com Wed Jun 6 21:51:33 2007
From: jaudall at gmail.com (Joshua Udall)
Date: Wed, 6 Jun 2007 15:51:33 -0600
Subject: [Bioperl-l] blastxml interation
Message-ID: <52cea20c0706061451i39e44aeev8dc58d1e635665e7@mail.gmail.com>

I was searching in the deobfuscator under *Bio::Search::Result::BlastResult* but there doesn't seem to be a method to extract the iteration number from a blastxml report. I can see this number being possibly useful to count the number of queries that didn't hit anything, since there are no empty reports in the blastxml output.

If I'm missing something, I would welcome an example of how to retrieve the result iteration number.

Thanks in advance for any suggestions.

Josh

From dmessina at wustl.edu Wed Jun 6 22:18:26 2007
From: dmessina at wustl.edu (David Messina)
Date: Wed, 6 Jun 2007 17:18:26 -0500
Subject: [Bioperl-l] blastxml interation
In-Reply-To: <52cea20c0706061451i39e44aeev8dc58d1e635665e7@mail.gmail.com>
References: <52cea20c0706061451i39e44aeev8dc58d1e635665e7@mail.gmail.com>
Message-ID:

I think you want to look at the hits(), num_hits() and no_hits_found() methods. There is a private method _next_iteration_index() which should do what you asked for, but num_hits() looks like the better way.

By the way, hits() and num_hits() are listed on the Deobfuscator as having no documentation. This (as the below shows) is incorrect and is due to some nonstandard formatting issues which I will correct. _next_iteration_index() isn't listed on the Deobfuscator because it's a private method.

Hope this helps!
Dave

hits()

This method overrides Bio::Search::Result::GenericResult::hits to take into account the possibility of multiple iterations, as occurs in PSI-BLAST reports.
If there are multiple iterations, all 'new' hits for all iterations are returned.
These are the hits that did not occur in a previous iteration.
See Also: Bio::Search::Result::GenericResult::hits

num_hits()

This method overrides Bio::Search::Result::GenericResult::num_hits to take into account the possibility of multiple iterations, as occurs in PSI-BLAST reports.
If there are multiple iterations, calling num_hits() returns the number of 'new' hits for each iteration. These are the hits that did not occur in a previous iteration.
See Also: Bio::Search::Result::GenericResult::num_hits

no_hits_found()

Usage    : $nohits = $blast->no_hits_found( $iteration_number );
Purpose  : Get boolean indicator indicating whether or not any hits were present in the report.
           This is NOT the same as determining the number of hits via the hits() method, which will return zero hits if there were no hits in the report or if all hits were filtered out during the parse.
           Thus, this method can be used to distinguish these possibilities for hitless reports generated when filtering.
Returns  : Boolean
Argument : (optional) integer indicating the iteration number (PSI-BLAST)
           If iteration number is not specified and this is a PSI-BLAST result, then this method will return true only if all iterations had no hits found.

From apurva at cshl.edu Wed Jun 6 23:51:45 2007
From: apurva at cshl.edu (Apurva Narechania)
Date: Wed, 6 Jun 2007 19:51:45 -0400
Subject: [Bioperl-l] non-palindromic issue in Bio::Restriction::Analysis
Message-ID: <3F7C7E33-416A-4141-969A-DDC4716E8A44@cshl.edu>

Hi,

I was hoping you could confirm and give me some feedback on an issue I think I've found with the Bio::Restriction::Analysis module. I am using the enzyme AciI, a non-palindromic restriction enzyme with a 5' C | CGC 3' recognition site. The module should search both the forward and the reverse complement strings in the case of a non-palindromic enzyme. I have found that this works only intermittently.
For example, the following sequence: GAAAAAAACAAAGGAAGAAGCTAGCTAGCAGGGCACGCGGTTTGAGGATGGCTGGTGGCCGACCGCAGGGCG CGCGGTTG GAGGATTGCTGGTGGCCGACCAGATGAAACTCACGCGCGGCTGGGGACAGCTGGAATATTTGGGCGGCGGCG GCTGGTAT TACGGGAAAGGAGAGATAGGGTTTTGGACGGCAGCAGCTGGTATTTGGGCCACCAATTTTGCGCGCCAGTAC AGGACACC GATGCCGCAAATTGCACAATGCCTTTTATGGCGACTGACAGTGCGATGCTATAGGTATGAATTGTCGACTGA CAAAGTGA CACTATTCACATATAAATATAACGAATAACACTCAGTTGGAATATAGACATATGCCGACTCACCATCTGTGG CAATGTAT ACCGACTAACAATTCGATGCTAATTCTCTATTTATAGCGACAGTCGTCAGACACTAATTTGGTGTTGTGGTA TAATGCTA GTGCCTCACCGCTGTAGGTGTTGGTCTACTGGTGC Should digest into 10 fragments using this enzyme, but the module produces only 7. Could you please confirm this behavior, and if observed, suggest some possible fixes? This may be a bug in the _non_pal_enz method, or may be me overlooking something pretty obvious. Thanks, Apurva Narechania. From cjfields at uiuc.edu Thu Jun 7 00:51:00 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 6 Jun 2007 19:51:00 -0500 Subject: [Bioperl-l] blastxml interation In-Reply-To: References: <52cea20c0706061451i39e44aeev8dc58d1e635665e7@mail.gmail.com> Message-ID: Joshua, Just to make sure there is no confusion, do you mean a Bio::Search::Iteration::IterationI-based object? The iteration tags have multiple meanings apparently in BLAST XML output (multiple queries, multiple PSI-BLAST iterations). The current SearchIO::blastxml parser returns multiple Bio::Search::Result::BlastResult objects based on the iterations, so PSI-BLAST output is treated as multiple BLAST reports regardless (i.e. no Iteration objects). This is something I want to rectify but it may not be a easy fix. chris On Jun 6, 2007, at 5:18 PM, David Messina wrote: > I think you want to look at the hits(), num_hits() and no_hits_found > () methods. There is a private method _next_iteration_index() which > should do what you asked for, but num_hits() looks like the better > way. 
> > By the way, hits() and num_hits() are listed on the Deobfuscator as > having no documentation. This (as the below shows) is incorrect and > is due to some nonstandard formatting issues which I will correct. > _next_iteration_index() isn't listed on the Deobfuscator because it's > a private method. > > > Hope this helps! > Dave > > > hits() > > This method overrides Bio::Search::Result::GenericResult::hits to take > into account the possibility of multiple iterations, as occurs in PSI- > BLAST reports. > If there are multiple iterations, all 'new' hits for all iterations > are returned. > These are the hits that did not occur in a previous iteration. > See Also: Bio::Search::Result::GenericResult::hits > > num_hits() > > This method overrides Bio::Search::Result::GenericResult::num_hits to > take > into account the possibility of multiple iterations, as occurs in PSI- > BLAST reports. > If there are multiple iterations, calling num_hits() returns the > number of > 'new' hits for each iteration. These are the hits that did not occur > in a previous iteration. > See Also: Bio::Search::Result::GenericResult::num_hits > > no_hits_found() > > Usage : $nohits = $blast->no_hits_found( $iteration_number ); > Purpose : Get boolean indicator indicating whether or not any hits > were present in the report. > This is NOT the same as determining the number of > hits via > the hits() method, which will return zero hits if there > were no > hits in the report or if all hits were filtered out > during the parse. > > Thus, this method can be used to distinguish these > possibilities > for hitless reports generated when filtering. > > Returns : Boolean > Argument : (optional) integer indicating the iteration number (PSI- > BLAST) > If iteration number is not specified and this is a PSI- > BLAST result, > then this method will return true only if all > iterations had > no hits found. 
> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Thu Jun 7 00:45:14 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 6 Jun 2007 20:45:14 -0400 Subject: [Bioperl-l] PostgreSQL schema support in BioSQL and bioperl-db Message-ID: I have added support to BioSQL and bioperl-db for schemas in PostgreSQL. A schema in PostgreSQL is more or less a namespace for database objects (tables, indexes, views, etc) within a database. (A database in PostgreSQL is similar to the concept of a user in Oracle or MySQL, and therefore for the latter two schemas are synonymous with a user. [Not sure I'm still up-to-date on this for MySQL, but at least that's what I recall.]) When using the load_{seqdatabase,ontology,ncbi_taxonomy}.pl scripts, you specify the schema in which BioSQL resides using the --schema option. If you are using bioperl-db as a library, the Bio::DB::BioDB->new() call also accepts a -schema named parameter, and Bio::DB::DBContextI objects have a $dbc->schema() property for getting/setting the schema, Bio::DB::SimpleDBContext->new() accepts a -schema parameter, and you may also add the property to the .bioperldb connection parameter file (-schema => 'yourschemahere'). Thanks for Brian Osborne for being the instigator (and tester, and for adding the code to load_ncbi_taxonomy.pl - I came too late). 
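
A rough sketch of the library call described above, not taken from the announcement itself: only the -schema parameter comes from this message. The database name, host, credentials and schema name are placeholders, and the remaining named parameters are assumed to be the usual Bio::DB::BioDB->new() arguments. The snippet only tries to build the adaptor when bioperl-db is installed.

```perl
use strict;
use warnings;

# Placeholder connection settings; only -schema is taken from the
# announcement, everything else is an assumption for illustration.
my %connection = (
    -database => 'biosql',
    -driver   => 'Pg',
    -dbname   => 'biosql_test',
    -host     => 'localhost',
    -user     => 'biosql_user',
    -pass     => 'change_me',
    -schema   => 'my_biosql_schema',    # PostgreSQL schema holding the BioSQL tables
);

my $db;
if ( eval { require Bio::DB::BioDB; 1 } ) {
    # Wrapped in eval so a bad setting produces a warning, not a crash.
    $db = eval { Bio::DB::BioDB->new(%connection) };
    warn "could not set up the BioSQL adaptor: $@" unless $db;
}
else {
    warn "bioperl-db is not installed; nothing to connect to\n";
}
```

Keeping the settings in one hash means the schema name is stated in a single place if the same parameters are reused elsewhere.
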
-hilmar
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From jaudall at gmail.com Wed Jun 6 21:41:08 2007
From: jaudall at gmail.com (Joshua Udall)
Date: Wed, 6 Jun 2007 15:41:08 -0600
Subject: [Bioperl-l] blastxml interation number
Message-ID: <52cea20c0706061441n96ce803v9422e8d14461c2bd@mail.gmail.com>

I was searching in the deobfuscator under *Bio::Search::Result::BlastResult* but there doesn't seem to be a method to extract the iteration number from a blastxml report. I can see this number being very useful to count the number of queries that didn't hit anything, since there are no empty reports in the blastxml output.

If I'm missing something, I would welcome an example of how to retrieve the result iteration number; otherwise I'm suggesting that an iteration_count feature be added to the Result object.

Thanks in advance for any suggestions.

Josh

From holland at ebi.ac.uk Thu Jun 7 07:33:25 2007
From: holland at ebi.ac.uk (Richard Holland)
Date: Thu, 07 Jun 2007 08:33:25 +0100
Subject: [Bioperl-l] PostgreSQL schema support in BioSQL and bioperl-db
In-Reply-To:
References:
Message-ID: <4667B4C5.6070107@ebi.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sounds great. BioJava users shouldn't need to change anything to get this to work as PostgreSQL JDBC connection objects already require you to specify a schema.

cheers,
Richard

Hilmar Lapp wrote:
> I have added support to BioSQL and bioperl-db for schemas in PostgreSQL.
> A schema in PostgreSQL is more or less a namespace for database objects
> (tables, indexes, views, etc) within a database.
>
> (A database in PostgreSQL is similar to the concept of a user in Oracle
> or MySQL, and therefore for the latter two schemas are synonymous with a
> user.
[Not sure I'm still up-to-date on this for MySQL, but at least > that's what I recall.]) > > When using the load_{seqdatabase,ontology,ncbi_taxonomy}.pl scripts, you > specify the schema in which BioSQL resides using the --schema option. > > If you are using bioperl-db as a library, the Bio::DB::BioDB->new() call > also accepts a -schema named parameter, and Bio::DB::DBContextI objects > have a $dbc->schema() property for getting/setting the schema, > Bio::DB::SimpleDBContext->new() accepts a -schema parameter, and you may > also add the property to the .bioperldb connection parameter file > (-schema => 'yourschemahere'). > > Thanks for Brian Osborne for being the instigator (and tester, and for > adding the code to load_ncbi_taxonomy.pl - I came too late). > > -hilmar > --=========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.2.2 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGZ7TF4C5LeMEKA/QRApwUAJ48q46iX152pB6Xcc/717Ie8foUTQCgm3ij W/+0iO/ZsNDn1pLuf5yXbYA= =asUn -----END PGP SIGNATURE----- From mmokrejs at ribosome.natur.cuni.cz Thu Jun 7 14:26:44 2007 From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=) Date: Thu, 07 Jun 2007 16:26:44 +0200 Subject: [Bioperl-l] [BioPython] Cannot parse GenBank file In-Reply-To: References: <46655550.70400@ribosome.natur.cuni.cz> <46656D64.7010508@ribosome.natur.cuni.cz> <24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu> Message-ID: <466815A4.9060505@ribosome.natur.cuni.cz> Hi, Chris Fields wrote: > One thing I missed which explains the biopython error: the LOCUS line is > missing the locus identifier (see the NCBI example record link). This > doesn't choke the bioperl parser but it appears to stop the biopython > parser in it's tracks (maybe a feature instead of a bug!). 
> > You should try adding a unique identifier (maybe the name of the file or > record) to the LOCUS line to see if it works: > > LOCUS testfile 6499 bp ds-DNA linear 02-AUG-2006 > > The bioperl parser in CVS writes out the correct alphabet when this is > added: > > LOCUS testfile 6499 bp ds-DNA linear 02-AUG-2006 > > I'll try adding a warning to the bioperl parser for this. I have updated http://bugzilla.open-bio.org/show_bug.cgi?id=2305 but let me emphasize the LOCUS line now contains LOCUS pRL 5428 bp ds-DNA linear 07-JUN-2007 which still does not comply with the line you have proposed. But it can be parsed by bioperl-live from cvs. Is it still wrong? Testcase as pRL.gb-new in the bugzilla record #2305. Martin > > chris > > On Jun 5, 2007, at 10:28 AM, Chris Fields wrote: > >> Martin, >> >> The example file you give in the bioperl bugzilla report has several >> blank annotation lines which may lead to additional problems. When >> the BioPerl SeqIO parser finds annotation fields (SOURCE, ORGANISM, >> DEFINITION, etc) then it expects there will also be relevant data >> (text descriptions) accompanying it; I assume the BioPython parser >> expects likewise though I may be wrong. >> >> AFAIK the inclusion of field names w/o text isn't GenBank/EMBL- >> compliant. GenBank records lacking text either have a '.' instead or >> are left out entirely: >> >> http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html >> >> We could add a fix but you should probably contact the ApE developers >> and request that field names w/o text be left out or have '.' added. >> >> chris >> >> On Jun 5, 2007, at 9:04 AM, Martin MOKREJ? wrote: >> >>> Ezequiel Panepucci wrote: >>>>> genbank entry = parser.parse(fhandle) >>>> >>>> there is a space character between "genbank" and "entry". >>>> It is a syntax error. >>>> I suppose you meant "genbank_entry" ? >>> >>> Yes, the next command was right and has shown the error. Sorry, I >>> forgot >>> to delete the first attempt. 
;-) >>> >>>>>> genbank_entry = parser.parse(fhandle) >>> Traceback (most recent call last): >>> File "", line 1, in ? >>> File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py", >>> line 187, in parse >>> self._scanner.feed(handle, self._consumer) >>> File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", >>> line 360, in feed >>> self._feed_first_line(consumer, self.line) >>> File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", >>> line 835, in _feed_first_line >>> assert False, \ >>> AssertionError: Did not recognise the LOCUS line layout: >>> LOCUS 6499 bp ds-DNA linear 02-AUG-2006 >>> >>>>>> >>> >>> Martin >>> _______________________________________________ >>> BioPython mailing list - BioPython at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/biopython >> >> Christopher Fields >> Postdoctoral Researcher >> Lab of Dr. Robert Switzer >> Dept of Biochemistry >> University of Illinois Urbana-Champaign >> >> >> >> >> _______________________________________________ >> BioPython mailing list - BioPython at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biopython > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > > -- Dr. Martin Mokrejs Dept. 
of Genetics and Microbiology Faculty of Science, Charles University Vinicna 5, 128 43 Prague, Czech Republic http://www.iresite.org http://www.iresite.org/~mmokrejs From cjfields at uiuc.edu Thu Jun 7 15:31:45 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 7 Jun 2007 10:31:45 -0500 Subject: [Bioperl-l] [BioPython] Cannot parse GenBank file In-Reply-To: <466815A4.9060505@ribosome.natur.cuni.cz> References: <46655550.70400@ribosome.natur.cuni.cz> <46656D64.7010508@ribosome.natur.cuni.cz> <24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu> <466815A4.9060505@ribosome.natur.cuni.cz> Message-ID: <2A403865-F1E8-4D19-8D19-455C22E7C6D9@uiuc.edu> On Jun 7, 2007, at 9:26 AM, Martin MOKREJ? wrote: > Hi, > > Chris Fields wrote: >> One thing I missed which explains the biopython error: the LOCUS >> line is missing the locus identifier (see the NCBI example record >> link). This doesn't choke the bioperl parser but it appears to >> stop the biopython parser in it's tracks (maybe a feature instead >> of a bug!). >> You should try adding a unique identifier (maybe the name of the >> file or record) to the LOCUS line to see if it works: >> LOCUS testfile 6499 bp ds-DNA linear 02-AUG-2006 >> The bioperl parser in CVS writes out the correct alphabet when >> this is added: >> LOCUS testfile 6499 bp ds-DNA linear 02- >> AUG-2006 >> I'll try adding a warning to the bioperl parser for this. > > I have updated http://bugzilla.open-bio.org/show_bug.cgi?id=2305 > but let me > emphasize the LOCUS line now contains > LOCUS pRL 5428 bp ds-DNA linear > 07-JUN-2007 > > > which still does not comply with the line you have proposed. But it > can be > parsed by bioperl-live from cvs. Is it still wrong? Testcase as > pRL.gb-new > in the bugzilla record #2305. > > Martin That should work. 
There isn't a strict uniqueness test (that would require caching and isn't worth the trouble IMHO), though it's required you add something unique for the accession/locus if you plan on indexing them in the future. Parsing GenBank data produced from third-party software is problematic at best; there seems to be no steadfast rule with GenBank output for some programs, even though the specification is plainly stated in the NCBI release notes. My take on that is to have a stricter (read:follows release notes) GenBank parser which passes off the data in the record to default handler methods. A user could then subjugate the defined handlers with their own by subclassing the default handler class and overloading the methods or adding their own code references directly. chris ... From rich at thevillas.eclipse.co.uk Fri Jun 8 11:00:45 2007 From: rich at thevillas.eclipse.co.uk (richard) Date: Fri, 08 Jun 2007 12:00:45 +0100 Subject: [Bioperl-l] protparam Message-ID: <466936DD.8080604@thevillas.eclipse.co.uk> Hi, I noticed that in April someone asked whether there was a bioperl mod for obtaining protein sequence related properties using protparam. I have a module that could potentially be submitted to bioperl for this purpose. Does anybody have any thoughts on whether it should go in? Example script and the module are at: http://81.5.159.173/webshare/ Cheers Rich From cjfields at uiuc.edu Fri Jun 8 12:37:27 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 8 Jun 2007 07:37:27 -0500 Subject: [Bioperl-l] protparam In-Reply-To: <466936DD.8080604@thevillas.eclipse.co.uk> References: <466936DD.8080604@thevillas.eclipse.co.uk> Message-ID: <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu> Richard, We'll gladly add this in, though it'll need to be bioperlized (inherit Bio::Root::Root). We also generally ask for tests but it should be easy to write up a quick test suite using any protein seq. If you can could you add some bioperl-like POD to the module (i.e. 
SYNOPSIS, AUTHOR, DESCRIPTION, etc)? thanks! chris On Jun 8, 2007, at 6:00 AM, richard wrote: > > Hi, > > I noticed that in April someone asked whether there was a bioperl mod > for obtaining protein sequence related properties using protparam. > I have a module that could potentially be submitted to bioperl for > this > purpose. Does anybody have any thoughts on whether it should go in? > > Example script and the module are at: > > http://81.5.159.173/webshare/ > > > Cheers > Rich > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From mmokrejs at ribosome.natur.cuni.cz Fri Jun 8 11:09:42 2007 From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=) Date: Fri, 08 Jun 2007 13:09:42 +0200 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? Message-ID: <466938F6.7050903@ribosome.natur.cuni.cz> Hi, how can I convert GenBank/EMBL formatted file to a GFF file? The manpage for Bio::Graphics::FeatureFile does not help me in this way. The information is in the file, so I want just to extract the features to a GFF format, probably somewhere the sequence has to be stored ... Is there a tool so I can convert it automatically? ;) This would be great. I can't make the GFF manually for every file. Other programs draw plasmid maps also automatically from the GenBank formatted input so how can I do it in bioperl? 
Thanks for help, Martin From shameer at ncbs.res.in Fri Jun 8 14:11:00 2007 From: shameer at ncbs.res.in (Shameer Khadar) Date: Fri, 8 Jun 2007 19:41:00 +0530 (IST) Subject: [Bioperl-l] protparam In-Reply-To: <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu> References: <466936DD.8080604@thevillas.eclipse.co.uk> <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu> Message-ID: <54411.192.168.1.1.1181311860.squirrel@mail.ncbs.res.in> Richard, I asked for protparam module in bioperl ! Thats a good job. Cheers, SK > Richard, > > We'll gladly add this in, though it'll need to be bioperlized > (inherit Bio::Root::Root). We also generally ask for tests but it > should be easy to write up a quick test suite using any protein seq. > > If you can could you add some bioperl-like POD to the module (i.e. > SYNOPSIS, AUTHOR, DESCRIPTION, etc)? > > thanks! > > chris > > On Jun 8, 2007, at 6:00 AM, richard wrote: > >> >> Hi, >> >> I noticed that in April someone asked whether there was a bioperl mod >> for obtaining protein sequence related properties using protparam. >> I have a module that could potentially be submitted to bioperl for >> this >> purpose. Does anybody have any thoughts on whether it should go in? >> >> Example script and the module are at: >> >> http://81.5.159.173/webshare/ >> >> >> Cheers >> Rich >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Shameer Khadar Prof. R. 
Sowdhamini's Lab (# 25) The Computational Biology Group National Centre for Biological Sciences (TIFR) GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India T - 91-080-23666001 EXT - 6251 W - http://www.ncbs.res.in From dmessina at wustl.edu Fri Jun 8 14:58:20 2007 From: dmessina at wustl.edu (David Messina) Date: Fri, 8 Jun 2007 09:58:20 -0500 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? In-Reply-To: <466938F6.7050903@ribosome.natur.cuni.cz> References: <466938F6.7050903@ribosome.natur.cuni.cz> Message-ID: <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> Hi Martin, You're in luck -- the BioPerl core distribution includes two scripts for doing just that: genbank2gff genbank2gff3 Look in the scripts directory of the distro. Also, there is a *huge* amount of documentation and examples on the BioPerl website. http://www.bioperl.org/wiki/HOWTOs Reading those, reading the FAQ, and searching the mailing list archives are where I look first when I don't know how to do something in BioPerl. Dave -- Dave Messina Senior Analyst, Assembly Group Genome Sequencing Center Washington University St. Louis, MO From rich at thevillas.eclipse.co.uk Fri Jun 8 15:51:21 2007 From: rich at thevillas.eclipse.co.uk (richard) Date: Fri, 08 Jun 2007 16:51:21 +0100 Subject: [Bioperl-l] protparam In-Reply-To: <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu> References: <466936DD.8080604@thevillas.eclipse.co.uk> <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu> Message-ID: <46697AF9.2090502@thevillas.eclipse.co.uk> Hi, ok, great, that's no problem. I'll add the POD and bioperlize it, thanks Rich Chris Fields wrote: > Richard, > > We'll gladly add this in, though it'll need to be bioperlized > (inherit Bio::Root::Root). We also generally ask for tests but it > should be easy to write up a quick test suite using any protein seq. > > If you can could you add some bioperl-like POD to the module (i.e. > SYNOPSIS, AUTHOR, DESCRIPTION, etc)? > > thanks! 
> > chris > > On Jun 8, 2007, at 6:00 AM, richard wrote: > > >> Hi, >> >> I noticed that in April someone asked whether there was a bioperl mod >> for obtaining protein sequence related properties using protparam. >> I have a module that could potentially be submitted to bioperl for >> this >> purpose. Does anybody have any thoughts on whether it should go in? >> >> Example script and the module are at: >> >> http://81.5.159.173/webshare/ >> >> >> Cheers >> Rich >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From cjfields at uiuc.edu Fri Jun 8 17:45:17 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 8 Jun 2007 12:45:17 -0500 Subject: [Bioperl-l] protparam In-Reply-To: <46697AF9.2090502@thevillas.eclipse.co.uk> References: <466936DD.8080604@thevillas.eclipse.co.uk> <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu> <46697AF9.2090502@thevillas.eclipse.co.uk> Message-ID: Another issue is namespace. I suggest Bio::Tools::ProtParam, though there may be some others out there. We can add support for direct Bio::Seq/PrimarySeq input and other odds and ends once it's committed. Good work! chris On Jun 8, 2007, at 10:51 AM, richard wrote: > > Hi, > > ok, great, that's no problem. I'll add the POD and bioperlize it, > > thanks > Rich > > Chris Fields wrote: >> Richard, >> >> We'll gladly add this in, though it'll need to be bioperlized >> (inherit Bio::Root::Root). We also generally ask for tests but it >> should be easy to write up a quick test suite using any protein seq. 
>> >> If you can could you add some bioperl-like POD to the module (i.e. >> SYNOPSIS, AUTHOR, DESCRIPTION, etc)? >> >> thanks! >> >> chris >> >> On Jun 8, 2007, at 6:00 AM, richard wrote: >> >> >>> Hi, >>> >>> I noticed that in April someone asked whether there was a bioperl >>> mod >>> for obtaining protein sequence related properties using protparam. >>> I have a module that could potentially be submitted to bioperl for >>> this >>> purpose. Does anybody have any thoughts on whether it should go in? >>> >>> Example script and the module are at: >>> >>> http://81.5.159.173/webshare/ >>> >>> >>> Cheers >>> Rich >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> >> Christopher Fields >> Postdoctoral Researcher >> Lab of Dr. Robert Switzer >> Dept of Biochemistry >> University of Illinois Urbana-Champaign >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Mon Jun 11 11:30:24 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 11 Jun 2007 07:30:24 -0400 Subject: [Bioperl-l] script to load ITIS taxonomy Message-ID: <897DB32F-4AEE-4388-A499-C71BFD2281DE@gmx.net> Hi all - I added a script to load the ITIS taxonomy (www.itis.gov) into the phylodb module. It is called load_itis_taxonomy.pl and is in the scripts/ directory. 
It is independent of BioPerl right now (the ITIS download is either a
MS SQL Server or an Informix dump - no kidding), but I'm hoping that
at some point support for this can be integrated into Bio::TreeIO.

-hilmar
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From cjfields at uiuc.edu Mon Jun 11 12:24:50 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 11 Jun 2007 07:24:50 -0500
Subject: [Bioperl-l] script to load ITIS taxonomy
In-Reply-To: <897DB32F-4AEE-4388-A499-C71BFD2281DE@gmx.net>
References: <897DB32F-4AEE-4388-A499-C71BFD2281DE@gmx.net>
Message-ID: <99AC6C0F-10DD-4587-AFB3-32BC495CD2BD@uiuc.edu>

On Jun 11, 2007, at 6:30 AM, Hilmar Lapp wrote:

> Hi all -
>
> I added a script to load the ITIS taxonomy (www.itis.gov) into the
> phylodb module. It is called load_itis_taxonomy.pl and is in the
> scripts/ directory.
>
> It is independent of BioPerl right now (the ITIS download is either a
> MS SQL Server or an Informix dump - no kidding), but I'm hoping that
> at some point support for this can be integrated into Bio::TreeIO.
>
> -hilmar
> --
> ===========================================================
> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
> ===========================================================

I second the TreeIO support. Anyone up for it?

chris

From ryanx07 at hotmail.com Mon Jun 11 15:24:31 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Mon, 11 Jun 2007 10:24:31 -0500
Subject: [Bioperl-l] basic questions
Message-ID:

I just started to learn BioPerl by reading the BioPerl Tutorial on the
BioPerl website. Trying the first example on my Windows machine,

    use Bio::Perl;
    $seq_object = get_sequence('swiss',"ID ROA1_HUMAN");
    write_sequence(">roa1.fasta",'fasta',$seq_object);

I got the following error:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: swissprot stream with no ID. Not swissprot in my book
STACK: Error::throw
STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm:350
STACK: Bio::SeqIO::swiss::next_seq C:/Perl/site/lib/Bio\SeqIO\swiss.pm:178
STACK: Bio::DB::WebDBSeqI::get_Seq_by_id C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm:153
STACK: Bio::Perl::get_sequence C:/Perl/site/lib/Bio/Perl.pm:510
STACK: t8.pl:7

I cannot figure out what is wrong and cannot find the solution on the
web. Could someone help me please?

Also, this leads to my 2nd question: is there a way to search the
archive of the current list?

Thanks so much

R

From dmessina at wustl.edu Mon Jun 11 16:34:29 2007
From: dmessina at wustl.edu (David Messina)
Date: Mon, 11 Jun 2007 11:34:29 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To:
References:
Message-ID: <25517EA3-7BDA-44AC-BDF3-93A6810D9D63@wustl.edu>

The example code works here, but I'm on OS X.

Could you tell us which version of Perl and BioPerl you are using, and
which operating system? Are you getting anything in the roa1.fasta
file?

> is there a way to search the archive of the current list?

http://www.bioperl.org/wiki/Mailing_lists

Dave

From dmessina at wustl.edu Mon Jun 11 18:48:23 2007
From: dmessina at wustl.edu (David Messina)
Date: Mon, 11 Jun 2007 13:48:23 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To:
References:
Message-ID: <127743A7-1923-4DBF-A96E-276B5E0A7692@wustl.edu>

Hi,

Please use 'Reply All' so everyone on the list can follow the
discussion.

Try adding the following line after the line that starts with
$seq_object:

    print STDERR ref($seq_object), "\t", $seq_object->display_id, "\n";

And then run the program again.
What do you get? Could you post a complete printout of what you're
doing?

Dave

On Jun 11, 2007, at 11:45 AM, L Xu wrote:
> I used WinXP with BioPerl Inst_version 2.1.8 (Bioperl 1.5.2) and
> activeperl 5.8.8.819 Thank you very much.

From johnsonm at gmail.com Tue Jun 12 00:45:13 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Mon, 11 Jun 2007 19:45:13 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
Message-ID:

This bit in Bio::SeqFeature::Gene::Exon is causing me some problems
trying to extend Bio::Tools::Glimmer to handle 'wraparound' genes
(circular genomes):

sub location {
    my ($self,$value) = @_;

    if(defined($value) && $value->isa('Bio::Location::SplitLocationI')) {
        $self->throw("split or compound location is not allowed ".
                     "for an object of type " . ref($self));
    }
    return $self->SUPER::location($value);
}

That seems to be there all the way back to the initial revision
(checked in by Hilmar). I presume it's there because of code like this
(from the seq() method in Bio::SeqFeature::Generic):

    # assuming our seq object is sensible, it should not have to yank
    # the entire sequence out here.
    my $seq = $self->{'_gsf_seq'}->trunc($self->start(), $self->end());

That's not going to work too well with a feature that has a
Bio::Location::Split location. Fixing it up seems straightforward, if
a bit hackish. Something like:

    my $seq;

    if (ref($self->location()) eq 'Bio::Location::Split') {
        my $seqstring;
        my @sublocs = $self->location()->sub_Location();

        # concatenate the sequence under each sub-location in order
        foreach my $subloc (@sublocs) {
            $seqstring .= $self->{'_gsf_seq'}->trunc($subloc->start(),
                                                     $subloc->end())->seq();
        }
        # assign to the outer $seq rather than declaring a new one
        $seq = Bio::Seq->new( -id  => $self->{'_gsf_seq'}->display_id(),
                              -seq => $seqstring );
    }
    else {
        $seq = $self->{'_gsf_seq'}->trunc($self->start(), $self->end());
    }

I don't see any companion to trunc() in Bio::PrimarySeqI for joining
sequences. A join() would be handy, and make the above cleaner.
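
As a rough illustration of the concatenation step above, here is a
minimal plain-Perl sketch that needs no BioPerl install. It treats each
sub-location as a 1-based, inclusive [start, end] pair (the same
convention trunc() uses) and only handles the forward strand. The helper
name join_sublocations and the 20 bp test genome are made up for this
example; neither is part of BioPerl.

```perl
use strict;
use warnings;

# Hypothetical helper, not a BioPerl method: concatenate the pieces of a
# split location. Each sub-location is [start, end], 1-based, inclusive.
sub join_sublocations {
    my ($sequence, @sublocs) = @_;
    my $joined = '';
    for my $subloc (@sublocs) {
        my ($start, $end) = @$subloc;
        die "bad sub-location $start..$end\n"
            if $start < 1 || $end < $start || $end > length($sequence);
        # substr() is 0-based, so shift the start down by one.
        $joined .= substr($sequence, $start - 1, $end - $start + 1);
    }
    return $joined;
}

# A made-up 20 bp circular genome with a 'wraparound' gene that runs
# from base 17 through the origin to base 4.
my $genome = 'ATGCCGTAGGCTTAACGGAT';
my $gene   = join_sublocations($genome, [17, 20], [1, 4]);
print "$gene\n";    # prints GGATATGC
```

The order of the sub-locations matters: the piece before the origin has
to come first, which is what sub_Location() would need to return for the
real thing to work the same way.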
Comments, suggestions, rotten fruit? From torsten.seemann at infotech.monash.edu.au Tue Jun 12 06:18:27 2007 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Tue, 12 Jun 2007 16:18:27 +1000 Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split) In-Reply-To: References: Message-ID: Mark, > if (ref($self->location()) eq 'Bio::Location::Split')) { > my $seqstring; > my @sublocs = $self->location()->sub_Location(); > > foreach my $subloc (@sublocs) { > $seqstring .= $self->{'_gsf_seq'}->trunc($subloc->start(), > $subloc->end())->seq(); > } Can you use the ->spliced_seq() method to do this? http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/SeqFeatureI.html#POD11 -- --Torsten Seemann --Victorian Bioinformatics Consortium, Monash University --Tel +61 3 9905 9010 From pengchy at yahoo.com.cn Tue Jun 12 07:00:46 2007 From: pengchy at yahoo.com.cn (=?gb2312?q?=D1=EE=20=C5=F4=B3=CC?=) Date: Tue, 12 Jun 2007 15:00:46 +0800 (CST) Subject: [Bioperl-l] Can't locate loadable object for module TFBS::Ext::pwmsearch Message-ID: <66745.92089.qm@web15205.mail.cnb.yahoo.com> hi all, Today, I download the TFBS package from http://forkhead.cgb.ki.se/TFBS/, and uncompress it and copy all the files contained in the TFBS and Ext directories to directory "C:\perl\site\lib", then put Ext under the TFBS directory. I run the example script1.pl, but a wrong message respond: Can't locate loadable object for module TFBS::Ext::pwmsearch in @INC (@INC contains: C:/perl/site/lib C:/perl/lib .) at C:/perl/site/lib/TFBS/Matrix/PWM.pm line 141 Compilation failed in require at C:/perl/site/lib/TFBS/Matrix/PWM.pm line 141, line 206. BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/Matrix/PWM.pm line 141, line 206. Compilation failed in require at C:/perl/site/lib/TFBS/Matrix/PFM.pm line 137, < DATA> line 206. BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/Matrix/PFM.pm line 137, line 206. 
Compilation failed in require at C:/perl/site/lib/TFBS/DB/TRANSFAC.pm line 52, line 206. BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/DB/TRANSFAC.pm line 52, line 206. Compilation failed in require at script1.pl line 3, line 206. BEGIN failed--compilation aborted at script1.pl line 3, line 206. shell returned 2 when I run the list_matrices.pl script, the same message respond. But when I empty the pwmsearch.pm file, following message respond: TFBS/Ext/pwmsearch.pm did not return a true value at :/perl/site/lib/TFBS/Matr x/PWM.pm line 141, line 206. BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/Matrix/PWM.pm line 11, line 206. Compilation failed in require at C:/perl/site/lib/TFBS/Matrix/PFM.pm line 137, DATA> line 206. BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/Matrix/PFM.pm line 17, line 206. Compilation failed in require at C:/perl/site/lib/TFBS/DB/TRANSFAC.pm line 52, DATA> line 206. BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/DB/TRANSFAC.pm line2, line 206. Compilation failed in require at script1.pl line 3, line 206. BEGIN failed--compilation aborted at script1.pl line 3, line 206. Is anyone else meet the same problem? Is it a bug for TFBS package? Best wishes! Sincerely, Pengcheng --------------------------------- ????????3.5G???20M??? From bix at sendu.me.uk Tue Jun 12 07:32:02 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 12 Jun 2007 08:32:02 +0100 Subject: [Bioperl-l] Can't locate loadable object for module TFBS::Ext::pwmsearch In-Reply-To: <66745.92089.qm@web15205.mail.cnb.yahoo.com> References: <66745.92089.qm@web15205.mail.cnb.yahoo.com> Message-ID: <466E4BF2.7020504@sendu.me.uk> ? ?? wrote: > hi all, > > Today, I download the TFBS package from > http://forkhead.cgb.ki.se/TFBS/, and uncompress it and copy all the > files contained in the TFBS and Ext directories to directory > "C:\perl\site\lib", then put Ext under the TFBS directory. 
I run the > example script1.pl, but a wrong message respond: > > Can't locate loadable object for module TFBS::Ext::pwmsearch in @INC You have to follow the installation instructions in the README file. Copying the files out is insufficient - you have to 'make'. From ryanx07 at hotmail.com Tue Jun 12 11:30:09 2007 From: ryanx07 at hotmail.com (L Xu) Date: Tue, 12 Jun 2007 06:30:09 -0500 Subject: [Bioperl-l] basic questions In-Reply-To: <127743A7-1923-4DBF-A96E-276B5E0A7692@wustl.edu> Message-ID: Here is the code: use Bio::Perl; $seq_object = get_sequence('swiss',"ROA1_HUMAN"); print STDERR ref($seq_object), "\t", $seq_object->display_id, "\n"; write_sequence(">roa1.fasta",'fasta',$seq_object); The output looks like the same as the previous version: Microsoft Windows XP [Version 5.1.2600] (C) Copyright 1985-2001 Microsoft Corp. C:\~Scripts>perl test.pl ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: swissprot stream with no ID. Not swissprot in my book STACK: Error::throw STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm:350 STACK: Bio::SeqIO::swiss::next_seq C:/Perl/site/lib/Bio\SeqIO\swiss.pm:178 STACK: Bio::DB::WebDBSeqI::get_Seq_by_id C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm:15 3 STACK: Bio::Perl::get_sequence C:/Perl/site/lib/Bio/Perl.pm:510 STACK: test.pl:7 ----------------------------------------------------------- Thanks. >From: David Messina >To: L Xu >CC: BioPerl list >Subject: Re: [Bioperl-l] basic questions >Date: Mon, 11 Jun 2007 13:48:23 -0500 > >Hi, > >Please use 'Reply All' so everyone on the list can follow the discussion. > >Try adding the following line after the line that starts with $seq_object: > > print STDERR ref($seq_object), "\t", $seq_object->display_id, "\n"; > >And then run the program again. What do you get? Could you post a complete >printout of what you're doing? 
> > >Dave > > >On Jun 11, 2007, at 11:45 AM, L Xu wrote: >>I used WinXP with BioPerl Inst_version 2.1.8 (Bioperl 1.5.2) and >>activeperl 5.8.8.819 Thank you very much. > _________________________________________________________________ Picture this ? share your photos and you could win big! http://www.GETREALPhotoContest.com?ocid=TXT_TAGHM&loc=us From pengchy at yahoo.com.cn Tue Jun 12 14:33:15 2007 From: pengchy at yahoo.com.cn (Pengcheng Yang) Date: Tue, 12 Jun 2007 22:33:15 +0800 (CST) Subject: [Bioperl-l] =?gb2312?q?=BB=D8=B8=B4=A3=BA=20Re:=20=20basic=20questions?= In-Reply-To: Message-ID: <936780.8655.qm@web15215.mail.cnb.yahoo.com> I got the same questions. I guess that the swissprote database has some problems! code: use Bio::DB::SwissProt; $sp = new Bio::DB::SwissProt; $seq = $sp->get_Seq_by_id('KPY1_ECOLI'); print ref($seq),"\t",$seq->display_id,"\n" the mesage: ------------- EXCEPTION ------------- MSG: swissprot stream with no ID. Not swissprot in my book STACK Bio::SeqIO::swiss::next_seq C:/perl/site/lib/Bio\SeqIO\swiss.pm:180 STACK Bio::DB::WebDBSeqI::get_Seq_by_id C:/perl/site/lib/Bio/DB/WebDBSeqI.pm:154 STACK toplevel t.pl:7 -------------------------------------- --- L Xu ??: > Here is the code: > > use Bio::Perl; > $seq_object = get_sequence('swiss',"ROA1_HUMAN"); > print STDERR ref($seq_object), "\t", $seq_object->display_id, "\n"; > write_sequence(">roa1.fasta",'fasta',$seq_object); > > The output looks like the same as the previous version: > > Microsoft Windows XP [Version 5.1.2600] > (C) Copyright 1985-2001 Microsoft Corp. > > C:\~Scripts>perl test.pl > > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: swissprot stream with no ID. 
Not swissprot in my book > STACK: Error::throw > STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm:350 > STACK: Bio::SeqIO::swiss::next_seq > C:/Perl/site/lib/Bio\SeqIO\swiss.pm:178 > STACK: Bio::DB::WebDBSeqI::get_Seq_by_id > C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm:15 > 3 > STACK: Bio::Perl::get_sequence C:/Perl/site/lib/Bio/Perl.pm:510 > STACK: test.pl:7 > ----------------------------------------------------------- > > Thanks. > > > > > > >From: David Messina > >To: L Xu > >CC: BioPerl list > >Subject: Re: [Bioperl-l] basic questions > >Date: Mon, 11 Jun 2007 13:48:23 -0500 > > > >Hi, > > > >Please use 'Reply All' so everyone on the list can follow the > discussion. > > > >Try adding the following line after the line that starts with > $seq_object: > > > > print STDERR ref($seq_object), "\t", $seq_object->display_id, "\n"; > > > >And then run the program again. What do you get? Could you post a > complete > >printout of what you're doing? > > > > > >Dave > > > > > >On Jun 11, 2007, at 11:45 AM, L Xu wrote: > >>I used WinXP with BioPerl Inst_version 2.1.8 (Bioperl 1.5.2) and > >>activeperl 5.8.8.819 Thank you very much. > > > > _________________________________________________________________ > Picture this ?share your photos and you could win big! > http://www.GETREALPhotoContest.com?ocid=TXT_TAGHM&loc=us > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Best wishes! Sincerely, Pengcheng ___________________________________________________________ ????????3.5G???20M??? 
http://cn.mail.yahoo.com From drummike at gmail.com Tue Jun 12 15:49:36 2007 From: drummike at gmail.com (Mike Williams) Date: Tue, 12 Jun 2007 11:49:36 -0400 Subject: [Bioperl-l] =?GB2312?B?UmU6IFtCaW9wZXJsLWxdILvYuLSjuiBSZTogYmFzaWMgcXVlc3Rpb25z?= In-Reply-To: <936780.8655.qm@web15215.mail.cnb.yahoo.com> References: <936780.8655.qm@web15215.mail.cnb.yahoo.com> Message-ID: On 6/12/07, Pengcheng Yang wrote: > I got the same questions. > I guess that the swissprote database has some problems! > code: > use Bio::DB::SwissProt; > $sp = new Bio::DB::SwissProt; > $seq = $sp->get_Seq_by_id('KPY1_ECOLI'); > print ref($seq),"\t",$seq->display_id,"\n" > ------------- EXCEPTION ------------- > MSG: swissprot stream with no ID. Not swissprot in my book > STACK toplevel t.pl:7 This is a different problem. The id was not valid. If you change KPY1 to KPYK1 it works fine. $seq = $sp->get_Seq_by_id('KPYK1_ECOLI'); print ref($seq),"\t",$seq->display_id,"\n" [mike at Wheatley]$ ./bio_quest2.pl Bio::Seq::RichSeq KPYK1_ECOLI If you got this example from the bio perl site would you please post the url? Seems to me this same problem has come up before, but I could not find it in the archives nor on the web site. Mike From ryanx07 at hotmail.com Tue Jun 12 15:42:28 2007 From: ryanx07 at hotmail.com (L Xu) Date: Tue, 12 Jun 2007 10:42:28 -0500 Subject: [Bioperl-l] basic questions Message-ID: I tested another code (the 2nd test on the same machine) from the tutorial and got error again. I don't know what happened and please help. Thanks so much. 
===========================================================
Code:

    use Bio::Restriction::EnzymeCollection;
    my $all_collection = Bio::Restriction::EnzymeCollection;
    my $six_cutter_collection = $all_collection->cutters(6);
    for my $enz ($six_cutter_collection){
        print $enz->name,"\t",$enz->site,"\t",$enz->overhang_seq,"\n";
        # prints name, recognition site, overhang
    }

===========================================
Results:

C:\~Scripts>perl t9.pl
Can't use string ("Bio::Restriction::EnzymeCollecti") as a HASH ref while "strict refs" in use at C:/Perl/site/lib/Bio/Restriction/EnzymeCollection.pm line 236.

= = = Original message = = =

On Jun 11, 2007, at 11:45 AM, L Xu wrote:
I used WinXP with BioPerl Inst_version 2.1.8 (Bioperl 1.5.2) and
activeperl 5.8.8.819 Thank you very much.

From limericksean at gmail.com Tue Jun 12 16:04:40 2007
From: limericksean at gmail.com (Sean O'Keeffe)
Date: Tue, 12 Jun 2007 18:04:40 +0200
Subject: [Bioperl-l] gff2xml
Message-ID: <462784640706120904g25a6550dsc56a22af64ca98cd@mail.gmail.com>

Hi all,
I posted this on the gbrowse list earlier.
I'm looking to convert gff data files into xml.
Does anyone know of a module written to do this already?
respect,
sean.
From johnsonm at gmail.com Tue Jun 12 16:10:45 2007 From: johnsonm at gmail.com (Mark Johnson) Date: Tue, 12 Jun 2007 11:10:45 -0500 Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split) In-Reply-To: References: Message-ID: On 6/12/07, Torsten Seemann wrote: > Can you use the ->spliced_seq() method to do this? > > http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/SeqFeatureI.html#POD11 > > -- > --Torsten Seemann > --Victorian Bioinformatics Consortium, Monash University > --Tel +61 3 9905 9010 Actually, I'd forgotten about spliced_seq(). That seems like it will Do The Right Thing. It's just up to the invoker to call spliced_seq() instead of seq() as appropriate. So, is there any other code that will break if I modify Bio::SeqFeature::Gene::Exon::location to not throw an exception when encountering Bio::Location::SplitLocationI? I'm wondering if it's just a paranoid check or if it's there to guard against something. If the latter, I need to know what code to fix. I'll dig and look, but if anybody knows or has an idea, save me some time. I suppose I can just change it and see what tests start failing. 8) From dmessina at wustl.edu Tue Jun 12 16:11:36 2007 From: dmessina at wustl.edu (David Messina) Date: Tue, 12 Jun 2007 11:11:36 -0500 Subject: [Bioperl-l] basic questions In-Reply-To: References: Message-ID: <30B8F841-E694-4577-8C15-8703E846CDFE@wustl.edu> Hmm, it almost looks like you're having an issue with line breaks. The 'swissprot stream with no ID' error made me think that perhaps Perl wasn't seeing the second argument to get_sequence. And then your new program has the error 'Can't use string ("Bio::Restriction::EnzymeCollecti")' where the end of the word is cut off. I don't know how ActivePerl handles Windows vs UNIX line breaks. Are there any example scripts that come with ActivePerl? 
If there are, and they run correctly, perhaps you could look to see how the line breaks are done and make sure the your program does it the same way. Other than that, I'm not seeing an obvious answer to your problem -- anyone else have a suggestion? Perhaps the easiest thing for you to do would be to reinstall BioPerl and make sure that you run the full test suite and that all of the tests pass. My guess is that something in your current setup is not quite right. Dave From cjfields at uiuc.edu Tue Jun 12 16:42:29 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 12 Jun 2007 11:42:29 -0500 Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split) In-Reply-To: References: Message-ID: On Jun 12, 2007, at 11:10 AM, Mark Johnson wrote: > On 6/12/07, Torsten Seemann > wrote: >> Can you use the ->spliced_seq() method to do this? >> >> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/ >> SeqFeatureI.html#POD11 >> >> -- >> --Torsten Seemann >> --Victorian Bioinformatics Consortium, Monash University >> --Tel +61 3 9905 9010 > > Actually, I'd forgotten about spliced_seq(). That seems like it > will Do The Right Thing. It's just up to the invoker to call > spliced_seq() instead of seq() as appropriate. > So, is there any other code that will break if I modify > Bio::SeqFeature::Gene::Exon::location to not throw an exception when > encountering Bio::Location::SplitLocationI? I'm wondering if it's > just a paranoid check or if it's there to guard against something. If > the latter, I need to know what code to fix. I'll dig and look, but > if anybody knows or has an idea, save me some time. I suppose I can > just change it and see what tests start failing. 8) I'm wondering why you want to use Bio::SeqFeature::Gene::Exon to describe the 'wrap-around' genes. The SeqFeature::Gene::Exon docs state that the Exon class is used to specifically describe exons, as the name implies. 
Exons are primarily eukaryotic in origin, so you shouldn't encounter wraparounds, and should not have split locations by definition (which likely explains the exception). Wouldn't a SeqFeature::Generic work just as well using a split location? chris From johnsonm at gmail.com Tue Jun 12 16:59:54 2007 From: johnsonm at gmail.com (Mark Johnson) Date: Tue, 12 Jun 2007 11:59:54 -0500 Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split) In-Reply-To: References: Message-ID: That's a good point. Both Bio::Tools::Glimmer and Bio::Tools::Genemark produce Bio::SeqFeature::Gene objects, each with a single Bio::SeqFeature::Gene::Exon, when parsing predictions for prokaryotic sequence (multiple exons for eukaryotic). There are eukaryotic and prokaryotic versions of both predictor families. Maybe the most elegant solution would be to simply modify both modules to only emit Bio::SeqFeature::Generic features when operating on prokaryotic mode output? Fix the data model and the problem goes away. 8) On 6/12/07, Chris Fields wrote: > > On Jun 12, 2007, at 11:10 AM, Mark Johnson wrote: > > > On 6/12/07, Torsten Seemann > > wrote: > >> Can you use the ->spliced_seq() method to do this? > >> > >> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/ > >> SeqFeatureI.html#POD11 > >> > >> -- > >> --Torsten Seemann > >> --Victorian Bioinformatics Consortium, Monash University > >> --Tel +61 3 9905 9010 > > > > Actually, I'd forgotten about spliced_seq(). That seems like it > > will Do The Right Thing. It's just up to the invoker to call > > spliced_seq() instead of seq() as appropriate. > > So, is there any other code that will break if I modify > > Bio::SeqFeature::Gene::Exon::location to not throw an exception when > > encountering Bio::Location::SplitLocationI? I'm wondering if it's > > just a paranoid check or if it's there to guard against something. If > > the latter, I need to know what code to fix. 
I'll dig and look, but
> > if anybody knows or has an idea, save me some time. I suppose I can
> > just change it and see what tests start failing. 8)
>
> I'm wondering why you want to use Bio::SeqFeature::Gene::Exon to
> describe the 'wrap-around' genes. The SeqFeature::Gene::Exon docs
> state that the Exon class is used to specifically describe exons, as
> the name implies. Exons are primarily eukaryotic in origin, so you
> shouldn't encounter wraparounds, and should not have split locations
> by definition (which likely explains the exception).
>
> Wouldn't a SeqFeature::Generic work just as well using a split location?
>
> chris
>

From ryanx07 at hotmail.com Tue Jun 12 17:17:18 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Tue, 12 Jun 2007 12:17:18 -0500
Subject: [Bioperl-l] basic questions
Message-ID:

I reinstalled activePerl and BioPerl, now the activePerl is 5.8.8 build 820.
However, both scripts generated the same error with my computer. I tested
the code in another WinXP computer with the same versions of activePerl and
BioPerl, the one for the swissprot did work but the restriction enzyme
generated the same error.

= = = Original message = = =

Hmm, it almost looks like you're having an issue with line breaks.

The 'swissprot stream with no ID' error made me think that perhaps Perl
wasn't seeing the second argument to get_sequence. And then your new
program has the error 'Can't use string
("Bio::Restriction::EnzymeCollecti")' where the end of the word is cut off.

I don't know how ActivePerl handles Windows vs UNIX line breaks. Are there
any example scripts that come with ActivePerl? If there are, and they run
correctly, perhaps you could look to see how the line breaks are done and
make sure the your program does it the same way.

Other than that, I'm not seeing an obvious answer to your problem -- anyone
else have a suggestion?

Perhaps the easiest thing for you to do would be to reinstall BioPerl
and make sure that you run the full test suite and that all of the tests pass.
My guess is that something in your current setup is not quite right.

Dave

From cjfields at uiuc.edu Tue Jun 12 17:51:47 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 12 Jun 2007 12:51:47 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To:
References:
Message-ID:

This is an instance where 'use strict' would have shown the problem
right away. You left off your constructor call:

    my $all_collection = Bio::Restriction::EnzymeCollection;

should be

    my $all_collection = Bio::Restriction::EnzymeCollection->new;

chris

On Jun 12, 2007, at 12:17 PM, L Xu wrote:

> I reinstalled activePerl and BioPerl, now the activePerl is 5.8.8
> build 820.
> However, both scripts generated the same error with my computer. I
> tested the code in another WinXP computer with the same versions of
> activePerl and BioPerl, the one for the swissprot did work but the
> restriction enzyme generated the same error.
>
> = = = Original message = = =
>
> Hmm, it almost looks like you're having an issue with line breaks.
>
> The 'swissprot stream with no ID' error made me think that perhaps
> Perl wasn't seeing the second argument to get_sequence. And then your
> new program has the error 'Can't use string
> ("Bio::Restriction::EnzymeCollecti")' where the end of the word is
> cut off.
>
> I don't know how ActivePerl handles Windows vs UNIX line breaks.
> Are there any example scripts that come with ActivePerl? If there
> are, and they run
> correctly, perhaps you could look to see how the line
breaks are done and
> make sure that your program does it the same way.
>
> Other than that, I'm not seeing an obvious answer to your problem -- anyone
> else have a suggestion?
>
> Perhaps the easiest thing for you to do would be to reinstall BioPerl and
> make sure that you run the full test suite and that all of the tests pass.
> My guess is that something in your current setup is not quite right.
>
> Dave

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From ryanx07 at hotmail.com Tue Jun 12 18:11:15 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Tue, 12 Jun 2007 13:11:15 -0500
Subject: [Bioperl-l] basic questions
Message-ID:

Thank you very much. That got the script a bit further, but now I get the following error:

C:\~Scripts>perl t9.pl
Can't locate object method "name" via package "Bio::Restriction::EnzymeCollection" at t9.pl line 5, line 532.

I checked the documentation; there is no "name" method for that package. Thanks.

= = = Original message = = =

This is an instance where 'use strict' would have shown the problem right away. You left off your constructor call:

    my $all_collection = Bio::Restriction::EnzymeCollection;

should be

    my $all_collection = Bio::Restriction::EnzymeCollection->new;

chris

On Jun 12, 2007, at 12:17 PM, L Xu wrote:

I reinstalled activePerl and BioPerl, now the activePerl is 5.8.8 build 820.
However, both scripts generated the same error with my computer. I tested the code in another WinXP computer with the same versions of activePerl and BioPerl, the one for the swissprot did work but the restriction enzyme generated the same error.

= = = Original message = = =

Hmm, it almost looks like you're having an issue with line breaks.

The 'swissprot stream with no ID' error made me think that perhaps Perl wasn't seeing the second argument to get_sequence. And then your new program has the error 'Can't use string ("Bio::Restriction::EnzymeCollecti")' where the end of the word is cut off.

I don't know how ActivePerl handles Windows vs UNIX line breaks. Are there any example scripts that come with ActivePerl? If there are, and they run correctly, perhaps you could look to see how the line breaks are done and make sure that your program does it the same way.

Other than that, I'm not seeing an obvious answer to your problem -- anyone else have a suggestion?

Perhaps the easiest thing for you to do would be to reinstall BioPerl and make sure that you run the full test suite and that all of the tests pass. My guess is that something in your current setup is not quite right.

Dave

From cjfields at uiuc.edu Tue Jun 12 18:35:15 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 12 Jun 2007 13:35:15 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To:
References:
Message-ID: <287E93E2-1902-4796-971E-B1DCA805D032@uiuc.edu>

Bio::Restriction::EnzymeCollection holds Bio::Restriction::Enzyme objects, each with its own name(). Using grouped methods like '$collection->cutters(6)' will retrieve a new EnzymeCollection containing all six-cutters from the original collection. You should use one of the EnzymeCollection accessor methods to retrieve the enzyme that you wanted first or iterate through them all. This works for me:

    use Bio::Restriction::EnzymeCollection;

    my $all_collection = Bio::Restriction::EnzymeCollection->new();
    my $six_cutter_collection = $all_collection->cutters(6);
    for my $enz ($six_cutter_collection->each_enzyme) {
        print $enz->name, "\t", $enz->site, "\t", $enz->overhang_seq, "\n";
    }

chris

On Jun 12, 2007, at 1:11 PM, L Xu wrote:

> Thank you very much, it did make the script advanced a bit but I
> got the following error:
>
> C:\~Scripts>perl t9.pl
> Can't locate object method "name" via package
> "Bio::Restriction::EnzymeCollection" at t9.pl line 5, line 532.
>
> I checked the documentation , there is no "name" method for the
> package. Thanks.

From johnsonm at gmail.com Tue Jun 12 19:07:57 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Tue, 12 Jun 2007 14:07:57 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split)
In-Reply-To:
References:
Message-ID:

I'll wait a day, and if there is no opinion to the contrary, implement it this way.

On 6/12/07, Mark Johnson wrote:
> That's a good point.
Both Bio::Tools::Glimmer and > Bio::Tools::Genemark produce Bio::SeqFeature::Gene objects, each with > a single Bio::SeqFeature::Gene::Exon, when parsing predictions for > prokaryotic sequence (multiple exons for eukaryotic). There are > eukaryotic and prokaryotic versions of both predictor families. Maybe > the most elegant solution would be to simply modify both modules to > only emit Bio::SeqFeature::Generic features when operating on > prokaryotic mode output? Fix the data model and the problem goes > away. 8) From torsten.seemann at infotech.monash.edu.au Wed Jun 13 00:18:27 2007 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Wed, 13 Jun 2007 10:18:27 +1000 Subject: [Bioperl-l] gff2xml In-Reply-To: <462784640706120904g25a6550dsc56a22af64ca98cd@mail.gmail.com> References: <462784640706120904g25a6550dsc56a22af64ca98cd@mail.gmail.com> Message-ID: Sean > I posted this on the gbrowse list earlier. I'm looking to convert gff > data files into xml. Does anyone know of a module written to do this > already? What DTD do you want the XML to conform to? eg. ChadoXML, TinySeq XML, TIGR XML ... ? What program are you trying to get to load the XML? BioPerl has some Bio::SeqIO:xxxxx modules for some XML formats that you could use. There is a script "bp_seqconvert.pl -h" which comes with BioPerl which may be useful. -- --Torsten Seemann --Victorian Bioinformatics Consortium, Monash University --Tel +61 3 9905 9010 From hlapp at gmx.net Wed Jun 13 00:55:57 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 12 Jun 2007 20:55:57 -0400 Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split) In-Reply-To: References: Message-ID: <0915FAB4-E554-4E65-BA3F-1B916F0F95FC@gmx.net> I think it was just trying to guard against people trying to do stupid things. I'm actually not sure that representing locations on a circular genome using split locations really is the best thing. 
I'm wondering whether one shouldn't rather introduce a CircularLocation object (though obviously it isn't the location that's circular...). Just a thought.

In the end, if you have a way to make this work that you feel comfortable with, then go for it.

	-hilmar

On Jun 12, 2007, at 12:10 PM, Mark Johnson wrote:

> On 6/12/07, Torsten Seemann wrote:
>> Can you use the ->spliced_seq() method to do this?
>>
>> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/SeqFeatureI.html#POD11
>>
>> --
>> --Torsten Seemann
>> --Victorian Bioinformatics Consortium, Monash University
>> --Tel +61 3 9905 9010
>
> Actually, I'd forgotten about spliced_seq(). That seems like it
> will Do The Right Thing. It's just up to the invoker to call
> spliced_seq() instead of seq() as appropriate.
> So, is there any other code that will break if I modify
> Bio::SeqFeature::Gene::Exon::location to not throw an exception when
> encountering Bio::Location::SplitLocationI? I'm wondering if it's
> just a paranoid check or if it's there to guard against something. If
> the latter, I need to know what code to fix. I'll dig and look, but
> if anybody knows or has an idea, save me some time. I suppose I can
> just change it and see what tests start failing. 8)

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From hlapp at gmx.net Wed Jun 13 00:57:06 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 12 Jun 2007 20:57:06 -0400
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split)
In-Reply-To:
References:
Message-ID: <80EAA2F1-B2DA-45F0-B591-8534C356E679@gmx.net>

I like that.
Don't force a model to do what you want if it doesn't really apply anyway. -hilmar On Jun 12, 2007, at 12:59 PM, Mark Johnson wrote: > That's a good point. Both Bio::Tools::Glimmer and > Bio::Tools::Genemark produce Bio::SeqFeature::Gene objects, each with > a single Bio::SeqFeature::Gene::Exon, when parsing predictions for > prokaryotic sequence (multiple exons for eukaryotic). There are > eukaryotic and prokaryotic versions of both predictor families. Maybe > the most elegant solution would be to simply modify both modules to > only emit Bio::SeqFeature::Generic features when operating on > prokaryotic mode output? Fix the data model and the problem goes > away. 8) > > On 6/12/07, Chris Fields wrote: >> >> On Jun 12, 2007, at 11:10 AM, Mark Johnson wrote: >> >>> On 6/12/07, Torsten Seemann >>> wrote: >>>> Can you use the ->spliced_seq() method to do this? >>>> >>>> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/ >>>> SeqFeatureI.html#POD11 >>>> >>>> -- >>>> --Torsten Seemann >>>> --Victorian Bioinformatics Consortium, Monash University >>>> --Tel +61 3 9905 9010 >>> >>> Actually, I'd forgotten about spliced_seq(). That seems like it >>> will Do The Right Thing. It's just up to the invoker to call >>> spliced_seq() instead of seq() as appropriate. >>> So, is there any other code that will break if I modify >>> Bio::SeqFeature::Gene::Exon::location to not throw an exception when >>> encountering Bio::Location::SplitLocationI? I'm wondering if it's >>> just a paranoid check or if it's there to guard against >>> something. If >>> the latter, I need to know what code to fix. I'll dig and look, but >>> if anybody knows or has an idea, save me some time. I suppose I can >>> just change it and see what tests start failing. 8) >> >> I'm wondering why you want to use Bio::SeqFeature::Gene::Exon to >> describe the 'wrap-around' genes. The SeqFeature::Gene::Exon docs >> state that the Exon class is used to specifically describe exons, as >> the name implies. 
Exons are primarily eukaryotic in origin, so you >> shouldn't encounter wraparounds, and should not have split locations >> by definition (which likely explains the exception). >> >> Wouldn't a SeqFeature::Generic work just as well using a split >> location? >> >> chris >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Wed Jun 13 01:20:41 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 12 Jun 2007 20:20:41 -0500 Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when encountering split location (Bio::Location::Split) In-Reply-To: <80EAA2F1-B2DA-45F0-B591-8534C356E679@gmx.net> References: <80EAA2F1-B2DA-45F0-B591-8534C356E679@gmx.net> Message-ID: <951EB9CA-2066-4CD1-BCD5-4E00232CA507@uiuc.edu> It will be interesting to see if bioperl handles wrap-around split locations via spliced_seq() and other methods. I can't see why it wouldn't but one never knows. Might be something to add to location tests at some point... chris On Jun 12, 2007, at 7:57 PM, Hilmar Lapp wrote: > I like that. Don't force a model to do what you want if it doesn't > really apply anyway. > > -hilmar > > On Jun 12, 2007, at 12:59 PM, Mark Johnson wrote: > >> That's a good point. Both Bio::Tools::Glimmer and >> Bio::Tools::Genemark produce Bio::SeqFeature::Gene objects, each with >> a single Bio::SeqFeature::Gene::Exon, when parsing predictions for >> prokaryotic sequence (multiple exons for eukaryotic). There are >> eukaryotic and prokaryotic versions of both predictor families. >> Maybe >> the most elegant solution would be to simply modify both modules to >> only emit Bio::SeqFeature::Generic features when operating on >> prokaryotic mode output? 
Fix the data model and the problem goes >> away. 8) >> >> On 6/12/07, Chris Fields wrote: >>> >>> On Jun 12, 2007, at 11:10 AM, Mark Johnson wrote: >>> >>>> On 6/12/07, Torsten Seemann >>>> wrote: >>>>> Can you use the ->spliced_seq() method to do this? >>>>> >>>>> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/ >>>>> SeqFeatureI.html#POD11 >>>>> >>>>> -- >>>>> --Torsten Seemann >>>>> --Victorian Bioinformatics Consortium, Monash University >>>>> --Tel +61 3 9905 9010 >>>> >>>> Actually, I'd forgotten about spliced_seq(). That seems >>>> like it >>>> will Do The Right Thing. It's just up to the invoker to call >>>> spliced_seq() instead of seq() as appropriate. >>>> So, is there any other code that will break if I modify >>>> Bio::SeqFeature::Gene::Exon::location to not throw an exception >>>> when >>>> encountering Bio::Location::SplitLocationI? I'm wondering if it's >>>> just a paranoid check or if it's there to guard against >>>> something. If >>>> the latter, I need to know what code to fix. I'll dig and look, >>>> but >>>> if anybody knows or has an idea, save me some time. I suppose I >>>> can >>>> just change it and see what tests start failing. 8) >>> >>> I'm wondering why you want to use Bio::SeqFeature::Gene::Exon to >>> describe the 'wrap-around' genes. The SeqFeature::Gene::Exon docs >>> state that the Exon class is used to specifically describe exons, as >>> the name implies. Exons are primarily eukaryotic in origin, so you >>> shouldn't encounter wraparounds, and should not have split locations >>> by definition (which likely explains the exception). >>> >>> Wouldn't a SeqFeature::Generic work just as well using a split >>> location? 
>>>
>>> chris
>>>
>
> --
> ===========================================================
> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
> ===========================================================

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From ryanx07 at hotmail.com Wed Jun 13 12:16:15 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Wed, 13 Jun 2007 07:16:15 -0500
Subject: [Bioperl-l] Example code in Bioperl Tutorial
Message-ID:

Thanks so much, Chris, it works now. All the code I tested was copied from the BioPerl Tutorial. Why do the examples have such problems: is it a platform issue, or different versions of BioPerl? So far I have tested 6 scripts; three work and three don't. Here is the problem with the third failing script:

=================================
use strict;
use Bio::Tools::Run::RemoteBlast;

my $remote_blast = Bio::Tools::Run::RemoteBlast->new (
    -prog   => 'blastn',
    -data   => 'ecoli',
    -expect => '1e-10' );
my $r = $remote_blast->submit_blast("d1.fa");
my $rc;
while ( my @rids = $remote_blast->each_rid ) {
    for my $rid ( @rids ) {
        $rc = $remote_blast->retrieve_blast($rid);
    }
}
print "$rc\n"; # I just want to print something here before parsing the result
================================= d1.fa
>example
CCCTTCAGGTACCCCGAGGTAACACGAGACACTCGGGATCTGGGAAGGGGACTGGGGCTTCTTTAAAAGCGCTCAGTTTAAAAAGCTTCTATGCCTGAATAGGTGACCGGAGGCCGGCACC
================================= result
C:\>perl t13.pl

-------------------- WARNING ---------------------
MSG: An Error Occurred

500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error) --------------------------------------------------- -------------------- WARNING --------------------- MSG: An Error Occurred

500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
---------------------------------------------------
Terminating on signal SIGINT(2)

C:\>

Please help me to correct the problem, thanks.

= = = Original message = = =

Bio::Restriction::EnzymeCollection holds Bio::Restriction::Enzyme objects, each with its own name(). Using grouped methods like '$collection->cutters(6)' will retrieve a new EnzymeCollection containing all six-cutters from the original collection. You should use one of the EnzymeCollection accessor methods to retrieve the enzyme that you wanted first or iterate through them all. This works for me:

    use Bio::Restriction::EnzymeCollection;

    my $all_collection = Bio::Restriction::EnzymeCollection->new();
    my $six_cutter_collection = $all_collection->cutters(6);
    for my $enz ($six_cutter_collection->each_enzyme) {
        print $enz->name, "\t", $enz->site, "\t", $enz->overhang_seq, "\n";
    }

chris

On Jun 12, 2007, at 1:11 PM, L Xu wrote:

Thank you very much, it did make the script advanced a bit but I got the following error:

C:\~Scripts>perl t9.pl
Can't locate object method "name" via package "Bio::Restriction::EnzymeCollection" at t9.pl line 5, line 532.

I checked the documentation , there is no "name" method for the package. Thanks.

_________________________________________________________________
Make every IM count. Download Messenger and join the i'm Initiative now. It's free.
http://im.live.com/messenger/im/home/?source=TAGHM_June07 From cjfields at uiuc.edu Wed Jun 13 14:41:55 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 13 Jun 2007 09:41:55 -0500 Subject: [Bioperl-l] Example code in Bioperl Tutorial In-Reply-To: References: Message-ID: <4F7BE556-BD8C-4378-BDE7-1F31364F49DA@uiuc.edu> Judging by the output it looks like you have no network access or can't connect to the server (what remoteblast needs). Make sure you don't need proxy settings. To preempt the next question, no, I'm not going to explain what a proxy is. The RemoteBlast docs show how to set them, and Google is a wonderful tool... chris On Jun 13, 2007, at 7:16 AM, L Xu wrote: > ... > -------------------- WARNING --------------------- > MSG: > An Error Occurred > >

> 500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
>
> ---------------------------------------------------
> ...

From ryanx07 at hotmail.com Wed Jun 13 15:01:07 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Wed, 13 Jun 2007 10:01:07 -0500
Subject: [Bioperl-l] Example code in Bioperl Tutorial
Message-ID:

I do have an internet connection but do not use a proxy server. I tested the network connection with the ping command (below). The NCBI website does not respond. Is there any special network setting needed to connect to the NCBI website? Thank you so much.

C:\>ping www.yahoo.com

Pinging www.yahoo-ht3.akadns.net [69.147.114.210] with 32 bytes of data:

Reply from 69.147.114.210: bytes=32 time=363ms TTL=45
Reply from 69.147.114.210: bytes=32 time=319ms TTL=45
Reply from 69.147.114.210: bytes=32 time=312ms TTL=45
Reply from 69.147.114.210: bytes=32 time=360ms TTL=45

Ping statistics for 69.147.114.210:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 312ms, Maximum = 363ms, Average = 338ms

C:\>ping www.ncbi.nlm.nih.gov

Pinging www.ncbi.nlm.nih.gov [130.14.29.110] with 32 bytes of data:

Request timed out.
Request timed out.
Request timed out.
Request timed out.

Ping statistics for 130.14.29.110:
    Packets: Sent = 4, Received = 0, Lost = 4 (100% loss),

= = = Original message = = =

Judging by the output it looks like you have no network access or can't connect to the server (what remoteblast needs). Make sure you don't need proxy settings.

To preempt the next question, no, I'm not going to explain what a proxy is. The RemoteBlast docs show how to set them, and Google is a wonderful tool...

chris

On Jun 13, 2007, at 7:16 AM, L Xu wrote:

...
-------------------- WARNING ---------------------
MSG: An Error Occurred

500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
---------------------------------------------------
...

From cjfields at uiuc.edu Wed Jun 13 16:14:22 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 11:14:22 -0500
Subject: [Bioperl-l] method naming
Message-ID: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>

Some quick questions on method naming. I couldn't find this on the mail list previously and just want some opinions.

1) Is there any preference on how to name a method that returns a list of class instances vs. data? I have seen 'each' (each_Location, each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs. simple (hits, hsps).

2) Do we want methods which return objects to have the object name in Title Case (each_Location, get_Seq_by_id, etc) or does it really matter?

chris

From dmessina at wustl.edu Wed Jun 13 16:41:53 2007
From: dmessina at wustl.edu (David Messina)
Date: Wed, 13 Jun 2007 11:41:53 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
Message-ID: <9A51046D-4827-4CF7-A2B7-7880E03129E9@wustl.edu>

> 1) Is there any preference on how to name a method that returns a
> list of class instances vs. data? I have seen 'each' (each_Location,
> each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs.
> simple (hits, hsps).

I'd prefer 'get_all' because it's more intuitive to me what the method is doing. 'Each' is too programmer-y.
> 2) Do we want have methods which return objects have the object name
> in Title Case (each_Location, get_Seq_by_id, etc) or does it really
> matter?

I like Title Case because it reinforces the notion that what you're getting back is a specific object with that name (Seq) rather than the generic thing that the name represents (AGTCTGTGATAT, the actual sequence as a string).

Dave

From hlapp at gmx.net Wed Jun 13 17:03:59 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 13 Jun 2007 13:03:59 -0400
Subject: [Bioperl-l] method naming
In-Reply-To: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
Message-ID: <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>

We set a convention a while back on how to name these. It is implemented in the bioperl.lisp file (too bad no one is using emacs any more these days - it's a great editor), and in fact we started a renaming campaign (not sure when that was) on the SeqI and SeqFeatureI classes (you'll still see the old names aliased).

However, we never got to finish the clean up.

The convention was to use get_{ClassName}s, and get_all_{ClassName}s if there is a difference to the former (mostly because of hierarchical data; for example features can be nested, and get_all_SeqFeatures returns them all flattened out, while get_SeqFeatures returns only the top objects), and for modifying add_{ClassName} and remove_{ClassName}s.

The class name was to be in title case to emphasize the fact that it is an array of objects you'd be getting back (and what kind of objects). If it is strings or any other scalar type, the name would be in lower case.

	-hilmar

On Jun 13, 2007, at 12:14 PM, Chris Fields wrote:

> Some quick questions on method naming. I couldn't find this on the
> mail list previously and just want some opinions.
>
> 1) Is there any preference on how to name a method that returns a
> list of class instances vs. data? I have seen 'each' (each_Location,
> each_tag_value) vs.
'get_all' (get_all_tags, get_all_SeqFeatures) vs.
> simple (hits, hsps).
>
> 2) Do we want have methods which return objects have the object name
> in Title Case (each_Location, get_Seq_by_id, etc) or does it really
> matter?
>
> chris

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From cjfields at uiuc.edu Wed Jun 13 17:19:43 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 12:19:43 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
Message-ID:

Sounds good. I also agree with Dave on the use of 'each', as it's a bit ambiguous (seems to imply iteration as opposed to returning a whole list).

We probably need to post this somewhere on the wiki for future reference; maybe in Advanced BioPerl? I'll add this in shortly.

chris

On Jun 13, 2007, at 12:03 PM, Hilmar Lapp wrote:

> We set a convention a while back on how to name these. It is
> implemented in the bioperl.lisp file (too bad no one is using emacs
> any more these days - it's a great editor), and in fact we started
> a renaming campaign (not sure when that was) on the SeqI and
> SeqFeatureI classes (you'll still see the old names aliased).
>
> However, we never got to finish the clean up.
>
> The convention was to use get_{ClassName}s, and get_all_{ClassName}s
> if there is a difference to the former (mostly because of
> hierarchical data; for example features can be nested, and
> get_all_SeqFeatures returns them all flattened out, while
> get_SeqFeatures returns only the top objects), and for modifying
> add_{ClassName} and remove_{ClassName}s.
> > The class name was to be in title case to emphasize the fact that > it is an array of object you'd be getting back (and what kind of > objects). If it is strings or any other scalar type, the name would > be in lower case. > > -hilmar > > On Jun 13, 2007, at 12:14 PM, Chris Fields wrote: > >> Some quick questions on method naming. I couldn't find this on the >> mail list previously and just want some opinions. >> >> 1) Is there any preference on how to name a method that returns a >> list of class instances vs. data? I have seen 'each' (each_Location, >> each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs. >> simple (hits, hsps). >> >> 2) Do we want have methods which return objects have the object name >> in Title Case (each_Location, get_Seq_by_id, etc) or does it really >> matter? >> >> chris >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Wed Jun 13 18:43:41 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 13 Jun 2007 13:43:41 -0500 Subject: [Bioperl-l] method naming In-Reply-To: <467036FC.8000505@watson.wustl.edu> References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <9A51046D-4827-4CF7-A2B7-7880E03129E9@wustl.edu> <467036FC.8000505@watson.wustl.edu> Message-ID: <286EE81C-0926-4AAE-9110-02948DFADF36@uiuc.edu> On Jun 13, 2007, at 1:27 PM, Michael Kiwala wrote: > > David Messina wrote: >>> 1) Is there any preference on how to name a method that returns a >>> list of class instances vs. data? I have seen >>> 'each' (each_Location, >>> each_tag_value) vs. 
'get_all' (get_all_tags, get_all_SeqFeatures)
>>> vs.
>>> simple (hits, hsps).
>>>
>>
>> I'd prefer 'get_all' because it's more intuitive to me what the
>> method is doing. 'Each' is too programmer-y.
>>
> When I think 'get_all', I think of a method that returns a list of
> objects at once. When I think of 'each', I think of a method that
> returns a scalar but can be called multiple times to iterate over a
> set of objects.

Yep, hence the ambiguity issue (and my confusion). I think it was so you could both iterate and return a list using this:

    for my $obj ($seq->each_Class) {...}
    my @objs = $seq->each_Class;

I use 'next' and 'get/get_all' as an iterator and get accessor (similar to how it's used in Bio::SearchIO):

    while (my $obj = $seq->next_Class) {...}
    my @objs = $seq->get_Class; # or get_all_Class for flattened lists

which to me is much clearer.

chris

From mkiwala at watson.wustl.edu Wed Jun 13 18:27:08 2007
From: mkiwala at watson.wustl.edu (Michael Kiwala)
Date: Wed, 13 Jun 2007 13:27:08 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <9A51046D-4827-4CF7-A2B7-7880E03129E9@wustl.edu>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <9A51046D-4827-4CF7-A2B7-7880E03129E9@wustl.edu>
Message-ID: <467036FC.8000505@watson.wustl.edu>

David Messina wrote:
>> 1) Is there any preference on how to name a method that returns a
>> list of class instances vs. data? I have seen 'each' (each_Location,
>> each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs.
>> simple (hits, hsps).
>>
>
> I'd prefer 'get_all' because it's more intuitive to me what the
> method is doing. 'Each' is too programmer-y.
>

When I think 'get_all', I think of a method that returns a list of objects at once. When I think of 'each', I think of a method that returns a scalar but can be called multiple times to iterate over a set of objects.
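
The distinction drawn in this exchange, an iterator that hands back one object per call versus an accessor that returns the whole list at once, can be sketched in plain Perl. This is only an illustration under stated assumptions: the Demo::FeatureHolder class, its hash-ref "features", and the rewind-on-exhaustion behaviour are hypothetical and not part of BioPerl; only the method-name patterns (add_{ClassName}, get_{ClassName}s, next_{ClassName}) come from the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical container class, NOT a BioPerl module. It exists only to
# show the naming patterns discussed in this thread.
package Demo::FeatureHolder;

sub new {
    my ($class) = @_;
    return bless { features => [], cursor => 0 }, $class;
}

# add_{ClassName}: append one or more objects, return the new count.
sub add_Feature {
    my ($self, @feats) = @_;
    push @{ $self->{features} }, @feats;
    return scalar @{ $self->{features} };
}

# get_{ClassName}s: return the whole list in a single call.
sub get_Features {
    my ($self) = @_;
    return @{ $self->{features} };
}

# next_{ClassName}: iterator; one object per call, undef when exhausted.
sub next_Feature {
    my ($self) = @_;
    my $feats = $self->{features};
    if ($self->{cursor} >= @$feats) {
        $self->{cursor} = 0;    # rewind so the holder can be walked again
        return;
    }
    return $feats->[ $self->{cursor}++ ];
}

package main;

my $holder = Demo::FeatureHolder->new;
$holder->add_Feature({ name => 'exon1' }, { name => 'exon2' });

# List style: everything at once.
my @all = $holder->get_Features;
print scalar(@all), " features\n";

# Iterator style: one object per call until undef comes back.
my @seen;
while (my $feat = $holder->next_Feature) {
    push @seen, $feat->{name};
}
print join(',', @seen), "\n";
```

With names split this way, a `while` loop pairs naturally with next_Feature and a `for` loop with get_Features, so the caller can tell from the name alone whether repeated calls are expected.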

From sac at bioperl.org Wed Jun 13 21:17:27 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Wed, 13 Jun 2007 14:17:27 -0700
Subject: [Bioperl-l] method naming
In-Reply-To: <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
Message-ID: <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>

On 6/13/07, Hilmar Lapp wrote:
> We set a convention a while back on how to name these. It is
> implemented in the bioperl.lisp file (too bad no one is using emacs
> any more these days - it's a great editor),

As a staunch xemacs-ophile I couldn't let that one slip by. Maybe we could improve the visibility of bioperl.lisp. In truth, I had forgotten about it, though it turns out I was loading an old version of it. (Btw, using the latest version of bioperl.lisp with xemacs 21.4.17, I don't get a bioperl menu item, though I can access bioperl functions via M-x. Suggestions?)

I see bioperl.lisp is mentioned twice parenthetically in the advanced bioperl wiki page. Perhaps a separate 'Editor/IDE support' item here would help. While we're at it, maybe we could add a bioperl.vi file to the distribution (if you can do such things with vi/vim).

On 6/13/07, Chris Fields wrote:
> We probably need to post this somewhere on the wiki for future
> reference; maybe in Advanced BioPerl? I'll add this in shortly.

Another idea: Add a method naming check to the set of audits we perform on CVS committed code. It could check for agreement with our conventions and warn if nothing was found (may not be a problem though).
Steve

From arareko at campus.iztacala.unam.mx Wed Jun 13 22:03:34 2007
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Wed, 13 Jun 2007 17:03:34 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
Message-ID: <467069B6.7080003@campus.iztacala.unam.mx>

By the time of the 1.5.2 release, I jumped onto the idea of creating a
BioPerl template for Komodo. Chris F handed me one he had already made
but in the end I didn't have enough spare time to get into it. If
someone wants to give it a try please let ChrisF/me know.

Regards,
Mauricio.

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM

From hlapp at gmx.net Wed Jun 13 22:41:45 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 13 Jun 2007 18:41:45 -0400
Subject: [Bioperl-l] method naming
In-Reply-To: <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
Message-ID:

On Jun 13, 2007, at 5:17 PM, Steve Chervitz wrote:

> using the latest version of bioperl.lisp with xemacs 21.4.17, I
> don't get a bioperl menu item

I'm using GNU Emacs 22.0.50.1 (as Aquamacs) and the BioPerl menu item
is showing up just beautifully. (BTW it also has very nice icons for
various functions - though I always feel guilty for using keystrokes
instead.)

Is GNU Emacs finally winning this? ;)

-hilmar
-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From jason at bioperl.org Wed Jun 13 22:58:51 2007
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 13 Jun 2007 15:58:51 -0700
Subject: [Bioperl-l] method naming
In-Reply-To:
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
Message-ID: <4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org>

Post your dueling screenshots to the wiki!
I had started a couple of IDE pages on the wiki a while ago: http://bioperl.org/wiki/Emacs http://bioperl.org/wiki/Emacs_template http://bioperl.org/wiki/Vi If anyone is feeling excited enough to write a few more IDE pages and link them into a common article that would be great. -jason -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From cjfields at uiuc.edu Wed Jun 13 23:08:17 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 13 Jun 2007 18:08:17 -0500 Subject: [Bioperl-l] method naming In-Reply-To: <4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org> References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com> <4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org> Message-ID: Would probably be worth writing one up for Komodo since Mauricio, Sendu, and I use it. I updated the Advanced BioPerl page with Hilmar's methods suggestions/ rules (as well as a few I found dating back a number of years on the mail list).
It might be worth a glance in case there are any changes needed: http://www.bioperl.org/wiki/Advanced_BioPerl chris Christopher Fields Postdoctoral Researcher Lab of Dr.
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Wed Jun 13 23:28:17 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 13 Jun 2007 19:28:17 -0400 Subject: [Bioperl-l] method naming In-Reply-To: References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com> <4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org> Message-ID: <06AE29E7-6FFA-4F92-8BC8-39D9E48549E4@gmx.net> Thanks Chris for doing this - looks great. The only comment that I have is that method names should never start with a capital letter. If the getter/setter is for a single object (as opposed to a list), the name should probably be similar (if not identical) to the class being expected and returned, but lower-case. E.g., $feature->location(), $seq->species() etc -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Wed Jun 13 23:44:08 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 13 Jun 2007 18:44:08 -0500 Subject: [Bioperl-l] method naming In-Reply-To: <06AE29E7-6FFA-4F92-8BC8-39D9E48549E4@gmx.net> References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu> <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net> <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com> <4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org> <06AE29E7-6FFA-4F92-8BC8-39D9E48549E4@gmx.net> Message-ID: <91AF2018-EC27-49FD-A4D1-C31C0E73DEFB@uiuc.edu> Agreed. We can definitely add that in.
As we edge towards another release we try another round of cleaning up. I wouldn't mind pushing out another 1.5 point release before summer's up if possible; most of the tough work was done for v.1.5.2 by Sendu. chris On Jun 13, 2007, at 6:28 PM, Hilmar Lapp wrote: > Thanks Chris for doing this - looks great. The only comment that I > have is that method names should never start with a capital letter. > If the getter/setter is for a single object (as opposed to a list), > the name should probably be similar (if not identical) to the class > being expected and returned, but lower-case. > > E.g., $feature->location(), $seq->species() etc > > -hilmar > > On Jun 13, 2007, at 7:08 PM, Chris Fields wrote: > >> Would probably be worth writing one up for Komodo since Mauricio, >> Sendu, and I use it. >> >> I updated the Advanced BioPerl page with Hilmar's methods >> suggestions/rules (as well as a few I found dating back a number of >> years on the mail list). It might be worth a glance in case there >> are any changes needed: >> >> http://www.bioperl.org/wiki/Advanced_BioPerl >> >> chris ... From johncumbers at gmail.com Thu Jun 14 00:20:42 2007 From: johncumbers at gmail.com (John Cumbers) Date: Wed, 13 Jun 2007 20:20:42 -0400 Subject: [Bioperl-l] How can I pull out all instances of a motif from a genome sequence and output them as a BED file? Message-ID: Hello, I have a simple problem, I'm trying to search a genome sequence for a motif, I then want to output a BED file to display all the locations of this motif on the UCSC Genome Browser. I could not find a script to do this, so I started to write my own. I'm new to perl and my code below was my attempt to read the sequence string and output the index bp of the start of each motif. With this I could build the BED file myself, which requires start and finish base pairs. For the first motif I can output the start index, but when I try and read the next one off the sequence it does not work. 
Instead I just get an output of a list of 1's. I realise that this is
more a request for some simple perl help, but any help much appreciated.

Best wishes,
John

$seq_object = read_sequence("Drosophila.Chr3.test.AE014296.fasta"); #turn my FASTA file into a seq object.
$sequence_as_a_string = $seq_object->seq(); #turn it into a string
# search $sequence_as_a_string string for motif AAA as example
# if found, return the index that it is found at

while ($sequence_as_a_string =~ m/AAA/g) {
    print "Found '$&'. Next attempt at character " .
        pos($sequence_as_a_string)+1 . "\n";
}

-- 
John Cumbers, Graduate Student
Biology and Medicine
Brown University, Box G-W
Providence, Rhode Island, 02912, USA
Tel USA: +1 401 523 8190, Fax: +1 401 863-2166
UK to USA: 0207 617 7824

From cjfields at uiuc.edu Thu Jun 14 01:58:37 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 20:58:37 -0500
Subject: [Bioperl-l] How can I pull out all instances of a motif from a genome sequence and output them as a BED file?
In-Reply-To:
References:
Message-ID: <5F3F49B4-CE07-4DD7-B82E-9DDE42B516D0@uiuc.edu>

This is answered in the FAQ (sorry if the URL wraps, but we don't like
tinyurls):

http://www.bioperl.org/wiki/FAQ#How_do_I_do_motif_searches_with_BioPerl.3F_Can_I_do_.22find_all_sequences_that_are_75.25_identical.22_to_a_given_motif.3F

chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From jason at bioperl.org Thu Jun 14 04:08:04 2007
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 13 Jun 2007 21:08:04 -0700
Subject: [Bioperl-l] wiki bulk update
Message-ID: <992B2C7A-E944-4C69-BDE0-B0B0F6D1274D@bioperl.org>

I did a bulk update of Module pages for new modules that had been
created since we last set up these pages. I outlined a little bit of
what it requires behind the scenes:
http://bioperl.org/wiki/BioPerl:Module_pages -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From bix at sendu.me.uk Thu Jun 14 09:35:00 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 14 Jun 2007 10:35:00 +0100 Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new() Message-ID: <46710BC4.3060302@sendu.me.uk> It is preferable to have ->new syntax over new Object syntax, as outlined here: http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules I propose making this syntax change in all Bioperl POD documentation, so that the bad syntax is no longer suggested/encouraged. Any objections? If not, I'll go ahead and commit the changes. (affects 907 modules in live) Cheers, Sendu. From bix at sendu.me.uk Thu Jun 14 10:01:02 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 14 Jun 2007 11:01:02 +0100 Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new() In-Reply-To: <46710BC4.3060302@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> Message-ID: <467111DE.6060800@sendu.me.uk> Sendu Bala wrote: > It is preferable to have ->new syntax over new Object syntax, as > outlined here: > http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules > > > I propose making this syntax change in all Bioperl POD documentation, Actually, I propose making the change to code as well. From hlapp at gmx.net Thu Jun 14 12:47:47 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 14 Jun 2007 08:47:47 -0400 Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new() In-Reply-To: <467111DE.6060800@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <467111DE.6060800@sendu.me.uk> Message-ID: <0D7CD74F-DCB3-44F8-9AC7-144B1BD58946@gmx.net> Sounds fine to me. People do go by working examples, and I've seen inconsistent examples leading to confusion on the end of newbies. 
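
For readers skimming the thread, here is a minimal sketch of the two constructor styles being compared. Toy::Widget is a made-up class used only for illustration, not a BioPerl module, and the discouraged form is shown in a comment rather than executed.

```perl
use strict;
use warnings;

package Toy::Widget;    # hypothetical class, not part of BioPerl

sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}

sub name { return $_[0]->{name} }

package main;

# Preferred: explicit class-method call. The arrow makes it unambiguous
# that 'new' is a method invoked on the class Toy::Widget.
my $widget = Toy::Widget->new(name => 'example');

# Discouraged: indirect object syntax, where perl has to guess whether
# 'new' is a method call or a plain function call.
#   my $widget = new Toy::Widget(name => 'example');

print $widget->name, "\n";
```

The arrow form reads the same for every class and every constructor name, which is part of why consistent examples in the POD matter for newcomers.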
-hilmar On Jun 14, 2007, at 6:01 AM, Sendu Bala wrote: > Sendu Bala wrote: >> It is preferable to have ->new syntax over new Object syntax, as >> outlined here: >> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object- >> oriented_programming_and_modules >> >> >> I propose making this syntax change in all Bioperl POD documentation, > > Actually, I propose making the change to code as well. > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Thu Jun 14 12:55:18 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 14 Jun 2007 07:55:18 -0500 Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new() In-Reply-To: <467111DE.6060800@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <467111DE.6060800@sendu.me.uk> Message-ID: Sounds fine by me. I may actually start tackling some of the feature/ annotation overloading stuff myself to see what happens (I'll drop a notice when that occurs). chris On Jun 14, 2007, at 5:01 AM, Sendu Bala wrote: > Sendu Bala wrote: >> It is preferable to have ->new syntax over new Object syntax, as >> outlined here: >> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object- >> oriented_programming_and_modules >> >> >> I propose making this syntax change in all Bioperl POD documentation, > > Actually, I propose making the change to code as well. From tanzeem.mb at gmail.com Thu Jun 14 06:27:19 2007 From: tanzeem.mb at gmail.com (tanzeem) Date: Wed, 13 Jun 2007 23:27:19 -0700 (PDT) Subject: [Bioperl-l] Problem working with remoteblast submit method in webbrowser. 
Message-ID: <11114623.post@talk.nabble.com>

I have a program that uses the BioPerl RemoteBlast module to compare an
amino acid FASTA file against the Swiss-Prot database. The
submit_blast() method works when run from the command line, but when
the program is run from a web browser it returns -1. I was trying to
adapt the code from the RemoteBlast synopsis for my needs.
-- 
View this message in context: http://www.nabble.com/Problem-working-with-remoteblast-submit-method-in-webbrowser.-tf3919886.html#a11114623
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.

From bix at sendu.me.uk Thu Jun 14 15:34:27 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 14 Jun 2007 16:34:27 +0100
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
In-Reply-To: <46710BC4.3060302@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk>
Message-ID: <46716003.2030302@sendu.me.uk>

Sendu Bala wrote:
> It is preferable to have ->new syntax over new Object syntax, as
> outlined here:
> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules
>
> I propose making this syntax change in all Bioperl POD documentation, so
> that the bad syntax is no longer suggested/encouraged. Any objections?
> If not, I'll go ahead and commit the changes.
>
> (affects 907 modules in live)

It was actually 515 modules & test scripts from live, 48 from run, 21
from db and 2 from network.

Now committed. Before and after my changes these were failing:

Failed Test      Stat Wstat Total Fail  List of Failed
-------------------------------------------------------------------------------
t/BioGraphics.t     3   768    38    3  3-5
t/PodSyntax.t       9  2304  2195    9  378 614 660 1023 1197 1512 1558 1932 2106
t/Sopma.t           2   512    16    2  8 15
t/genbank.t         2   512   247    2  122-123

BioGraphics failure caused by Graphics Panel.pm 1.135 -> 1.136
(unintentional?).

Sopma may not be a bug: results from server might have changed.
genbank caused by genbank.t 1.20 -> 1.21 and presumably genbank.pm 1.163
-> 1.164 not doing what the new tests expect.

PodSyntax is new and obviously a bunch of POD needs fixing: Nathan, are
you working on that, or can I fix those errors?

Anyone care to look into those things?

Cheers,
Sendu.

From cjfields at uiuc.edu Thu Jun 14 16:35:21 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 14 Jun 2007 11:35:21 -0500
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
In-Reply-To: <46716003.2030302@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
Message-ID:

The genbank commit was mine so I'll look into it; it may be that I
hadn't finished up the bug work. If I have time I'll look into Sopma as
well (unless you get to it first).

chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From n.haigh at sheffield.ac.uk Thu Jun 14 16:43:43 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 14 Jun 2007 17:43:43 +0100
Subject: [Bioperl-l] Perltidy
In-Reply-To: <46716003.2030302@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
Message-ID: <4671703F.4010109@sheffield.ac.uk>

I'm just wondering if anyone passes their modules through perltidy in
order for them to have the same look/feel? If so, do you have a
.perltidyrc file? Also, is it worth running the Bioperl modules through it?

Nath

From n.haigh at sheffield.ac.uk Thu Jun 14 16:36:37 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 14 Jun 2007 17:36:37 +0100
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
In-Reply-To: <46716003.2030302@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
Message-ID: <46716E95.3090604@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sendu Bala wrote:
> Sendu Bala wrote:
>> It is preferable to have ->new syntax over new Object syntax, as
>> outlined here:
>> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules
>>
>> I propose making this syntax change in all Bioperl POD documentation, so
>> that the bad syntax is no longer suggested/encouraged. Any objections?
>> If not, I'll go ahead and commit the changes. >> >> (affects 907 modules in live) > > It was actually 515 modules & test scripts from live, 48 from run, 21 > from db and 2 from network. > > Now committed. Before and after my changes these were failing: > > > Failed Test Stat Wstat Total Fail List of Failed > ------------------------------------------------------------------------------- > t/BioGraphics.t 3 768 38 3 3-5 > t/PodSyntax.t 9 2304 2195 9 378 614 660 1023 1197 1512 1558 > 1932 2106 > t/Sopma.t 2 512 16 2 8 15 > t/genbank.t 2 512 247 2 122-123 > > > BioGraphics failure caused by Graphics Panel.pm 1.135 -> 1.136 > (unintentional?). > > Sopma may not be a bug: results from server might have changed. > > genbank caused by genbank.t 1.20 -> 1.21 and presumably genbank.pm 1.163 > -> 1.164 not doing what the new tests expect. > > PodSyntax is new and obviously a bunch of POD needs fixing: Nathan, are > you working on that, or can I fix those errors? > I can fix these - although I'm still trying to get my new Debian 4.0 system up-to-speed so it might take me a little while! RE the PodSyntax.t, I've got it silently skipping the tests if Test::Pod isn't installed. However, would it be better to have Test::Pod in t/lib so that it runs on the user's system during installation or leave it as is? Nath -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGcW6VczuW2jkwy2gRAv3dAKCURgd4F881MhbessKxNh/cPrJu2wCeLwnS 7olroF2e6+4I0biz6fWRmu4= =s3hK -----END PGP SIGNATURE----- From bix at sendu.me.uk Thu Jun 14 17:15:24 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 14 Jun 2007 18:15:24 +0100 Subject: [Bioperl-l] Perltidy In-Reply-To: <4671703F.4010109@sheffield.ac.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> Message-ID: <467177AC.8060104@sendu.me.uk> Nathan S. 
Haigh wrote:
> I'm just wondering if anyone passes their modules through perltidy in
> order for them to have the same look/feel? If so, do you have a
> .perltidyrc file? Also, is it worth running the Bioperl modules through it?

I don't use it, but I was contemplating the same thing. Chris uses it
from time to time and I think we have a similar taste in style.

But we'd have to hammer something out that was agreeable to everyone.

From mmokrejs at ribosome.natur.cuni.cz Thu Jun 14 17:19:42 2007
From: mmokrejs at ribosome.natur.cuni.cz (Martin MOKREJŠ)
Date: Thu, 14 Jun 2007 19:19:42 +0200
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file?
In-Reply-To: <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>
References: <466938F6.7050903@ribosome.natur.cuni.cz> <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>
Message-ID: <467178AE.5040905@ribosome.natur.cuni.cz>

David Messina wrote:
> Hi Martin,
>
> You're in luck -- the BioPerl core distribution includes two scripts
> for doing just that:
>
> genbank2gff

Somehow these scripts were not installed for me on Gentoo, but I have
them in the cvs copy. ;-)
Anyway, the one above is not for me, I do not need the GFF database,
or better to say I have no intent to install that unknown thing, seems
like an overkill for my case. I just want to render a plasmid map.

> genbank2gff3

This one seems more promising but still with current cvs checkout I get...

$ perl /home/mmokrejs/proj/bioperl/bioperl-live/blib/script/bp_genbank2gff3.pl --in stdin --out stdout < ~/99.gb
# Input: stdin
Use of uninitialized value in split at /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/genbank.pm line 1335, line 7.
Use of uninitialized value in quotemeta at /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/genbank.pm line 1338, line 7.
Can't call method "binomial" on an undefined value at /home/mmokrejs/proj/bioperl/bioperl-live/blib/script/bp_genbank2gff3.pl line 675, line 125.
$
$ bp_seqconvert.pl --from genbank --to embl < ~/IRESite/gb/99.gb
Use of uninitialized value in split at /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/genbank.pm line 1335, line 7.
Use of uninitialized value in quotemeta at /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/genbank.pm line 1338, line 7.
ID unknown; SV 1; circular; unassigned DNA; STD; UNC; 5391 BP.
XX
AC unknown;
XX
XX
XX
CC ApEinfo:methylated:0
...

Oh dear, I have just manually edited the files and still they are wrong? Oh no. :(

>
> Look in the scripts directory of the distro.
>
> Also, there is a *huge* amount of documentation and examples on the
> BioPerl website.
>
> http://www.bioperl.org/wiki/HOWTOs

You mean http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File ? ;-)

>
> Reading those, reading the FAQ, and searching the mailing list
> archives are where I look first when I don't know how to do something
> in BioPerl.
>
>
> Dave
>
> --
> Dave Messina
> Senior Analyst, Assembly Group
> Genome Sequencing Center
> Washington University
> St. Louis, MO
>
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>

-- 
Dr. Martin Mokrejs
Dept. of Genetics and Microbiology
Faculty of Science, Charles University
Vinicna 5, 128 43 Prague, Czech Republic
http://www.iresite.org
http://www.iresite.org/~mmokrejs
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: 99.gb
URL:

From mmokrejs at ribosome.natur.cuni.cz Thu Jun 14 17:23:28 2007
From: mmokrejs at ribosome.natur.cuni.cz (Martin MOKREJŠ)
Date: Thu, 14 Jun 2007 19:23:28 +0200
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file?
In-Reply-To: <467178AE.5040905@ribosome.natur.cuni.cz>
References: <466938F6.7050903@ribosome.natur.cuni.cz> <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> <467178AE.5040905@ribosome.natur.cuni.cz>
Message-ID: <46717990.6040509@ribosome.natur.cuni.cz>

Martin MOKREJŠ wrote:
>> Also, there is a *huge* amount of documentation and examples on the
>> BioPerl website.
>>
>> http://www.bioperl.org/wiki/HOWTOs
>
> You mean
> http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File
> ? ;-)

$ perl embl2picture.pl ~/99.gb | display -
Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, line 125.
Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa57f0), feature Bio::Location::Simple=HASH(0x893ebb8): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, line 125.
Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5880), feature Bio::Location::Simple=HASH(0x893ebac): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, line 125.
Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5430), feature Bio::Location::Simple=HASH(0x893e720): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, line 125.
Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa546c), feature Bio::Location::Simple=HASH(0x893e6b4): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, line 125.
$ The plasmid is a circular DNA, why is the diagram in linear? ;-) Martin From bix at sendu.me.uk Thu Jun 14 17:03:34 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 14 Jun 2007 18:03:34 +0100 Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new() In-Reply-To: <46716E95.3090604@sheffield.ac.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <46716E95.3090604@sheffield.ac.uk> Message-ID: <467174E6.1090001@sendu.me.uk> Nathan S. Haigh wrote: >> PodSyntax is new and obviously a bunch of POD needs fixing: Nathan, are >> you working on that, or can I fix those errors? > > I can fix these - although I'm still trying to get my new Debian 4.0 > system up-to-speed so it might take me a little while! RE the > PodSyntax.t, I've got it silently skipping the tests if Test::Pod isn't > installed. However, would it be better to have Test::Pod in t/lib so > that it runs on the user's system during installation or leave it as is? Leave it as is. Every-day users don't need to check the syntax of the pod. In fact, it really only needs to be done once, prior to packaging up a new release. From n.haigh at sheffield.ac.uk Thu Jun 14 17:32:37 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Thu, 14 Jun 2007 18:32:37 +0100 Subject: [Bioperl-l] Perltidy In-Reply-To: <467177AC.8060104@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> Message-ID: <46717BB5.8000706@sheffield.ac.uk> Sendu Bala wrote: > Nathan S. Haigh wrote: >> I'm just wondering if anyone passes their modules through perltidy in >> order for them to have the same look/feel? If so, do you have a >> .perltidyrc file? Also, is it worth running the Bioperl modules >> through it? > > I don't use it, but I was contemplating the same thing. Chris uses it > from time to time and I think we have a similar taste in style. 
> > But we'd have to hammer something out that was agreeable to everyone. A starting place may be Perl Best Practices by Damian Conway: http://www.oreilly.com/catalog/perlbp/ The perltidyrc file can be found here: http://www.perlmonks.org/?node_id=485885 I also found this nice thread with some ideas, including some code that causes emacs to auto-perltidy everything you use cperl-mode with. I don't use emacs myself, but here's the link if anyone is interested: http://www.perlmonks.org/?node_id=516501 Nath From johnsonm at gmail.com Thu Jun 14 17:38:31 2007 From: johnsonm at gmail.com (Mark Johnson) Date: Thu, 14 Jun 2007 12:38:31 -0500 Subject: [Bioperl-l] Perltidy In-Reply-To: <467177AC.8060104@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> Message-ID: The nice thing about Perl Tidy is that everybody can have their own config file. There could be a bioperl default config that gets applied at checkin time. Anybody that didn't like it could script checkouts to get run through their own config. Diffs might get a little hairy, but as long as you tidy before diffing, it shouldn't be too bad. Speaking of which... coding style is controversial enough, but since that's already been opened, what about CVS vs Subversion? 8) Some of the scripting for this sort of thing might be easier in Subversion. Though maybe something like Git would fit the developer model better (more support for distributed development). On 6/14/07, Sendu Bala wrote: > Nathan S. Haigh wrote: > > I'm just wondering if anyone passes their modules through perltidy in > > order for them to have the same look/feel? If so, do you have a > > .perltidyrc file? Also, is it worth running the Bioperl modules through it? > > I don't use it, but I was contemplating the same thing. Chris uses it > from time to time and I think we have a similar taste in style.
> > But we'd have to hammer something out that was agreeable to everyone. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From n.haigh at sheffield.ac.uk Thu Jun 14 17:39:39 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Thu, 14 Jun 2007 18:39:39 +0100 Subject: [Bioperl-l] cvs changes in working copy Message-ID: <46717D5B.5040108@sheffield.ac.uk> Not sure if I'm being dense or if it's because I've been working with svn recently, but - how do I get a list of files that are different in my working copy compared to the repository? Cheers Nath From cjfields at uiuc.edu Thu Jun 14 17:46:38 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 14 Jun 2007 12:46:38 -0500 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? In-Reply-To: <46717990.6040509@ribosome.natur.cuni.cz> References: <466938F6.7050903@ribosome.natur.cuni.cz> <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> <467178AE.5040905@ribosome.natur.cuni.cz> <46717990.6040509@ribosome.natur.cuni.cz> Message-ID: Is 99.gb supposed to be a GenBank file? And you're loading it into embl2picture (which I assume takes EMBL format files)? Without example code we can easily make the wrong assumptions (i.e. that this is user error and not a BioPerl problem). Also, I don't believe the feature plotting scripts plot circular chromosomes/plasmids. If you want this functionality you'll have to code it for yourself. chris On Jun 14, 2007, at 12:23 PM, Martin MOKREJ? wrote: > Martin MOKREJ? wrote: > >>> Also, there is a *huge* amount of documentation and examples on the >>> BioPerl website. >>> >>> http://www.bioperl.org/wiki/HOWTOs >> >> You mean >> http://www.bioperl.org/wiki/ >> HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File >> ? 
;-) > > $ perl embl2picture.pl ~/99.gb | display - > Error returned while evaluating value of 'description' option for > glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature > Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method > "all_tags" via package "Bio::Location::Simple" at embl2picture.pl > line 141, line 125. > > Error returned while evaluating value of 'description' option for > glyph Bio::Graphics::Glyph::generic=HASH(0x8aa57f0), feature > Bio::Location::Simple=HASH(0x893ebb8): Can't locate object method > "all_tags" via package "Bio::Location::Simple" at embl2picture.pl > line 141, line 125. > > Error returned while evaluating value of 'description' option for > glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5880), feature > Bio::Location::Simple=HASH(0x893ebac): Can't locate object method > "all_tags" via package "Bio::Location::Simple" at embl2picture.pl > line 141, line 125. > > Error returned while evaluating value of 'description' option for > glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5430), feature > Bio::Location::Simple=HASH(0x893e720): Can't locate object method > "all_tags" via package "Bio::Location::Simple" at embl2picture.pl > line 141, line 125. > > Error returned while evaluating value of 'description' option for > glyph Bio::Graphics::Glyph::generic=HASH(0x8aa546c), feature > Bio::Location::Simple=HASH(0x893e6b4): Can't locate object method > "all_tags" via package "Bio::Location::Simple" at embl2picture.pl > line 141, line 125. > $ > > The plasmid is a circular DNA, why is the diagram in linear? ;-) > > Martin > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From arareko at campus.iztacala.unam.mx Thu Jun 14 17:57:35 2007 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Thu, 14 Jun 2007 12:57:35 -0500 Subject: [Bioperl-l] Perltidy In-Reply-To: <46717BB5.8000706@sheffield.ac.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <46717BB5.8000706@sheffield.ac.uk> Message-ID: <4671818F.5040902@campus.iztacala.unam.mx> I think a consensus .perltidyrc could be placed in the source distribution. Mauricio. Nathan S. Haigh wrote: > Sendu Bala wrote: >> Nathan S. Haigh wrote: >>> I'm just wondering if anyone passes their modules through perltidy in >>> order for them to have the same look/feel? If so, do you have a >>> .perltidyrc file? Also, is it worth running the Bioperl modules >>> through it? >> I don't use it, but I was contemplating the same thing. Chris uses it >> from time to time and I think we have a similar taste in style. >> >> But we'd have to hammer something out that was agreeable to everyone. > > A starting place maybe Perl Best Practices by Damian Conway: > http://www.oreilly.com/catalog/perlbp/ > > > The perltidyrc file can e found here: > http://www.perlmonks.org/?node_id=485885 > > I also found this nice thread with some ideas, inc some code that causes > emacs to auto-perltidy everything you use cperl-mode with. 
I don't use > emacs myself, ut here's the link if anyone is interested: > http://www.perlmonks.org/?node_id=516501 > > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Gen?tica Unidad de Morfofisiolog?a y Funci?n Facultad de Estudios Superiores Iztacala, UNAM From cjfields at uiuc.edu Thu Jun 14 18:32:41 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 14 Jun 2007 13:32:41 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> Message-ID: To chip in on this, I only use perltidy when I need to clean bioperl code up for debugging (particularly if blocks are hard to see) and just use the defaults. I agree it would be nice to have everything tidied up but it'll definitely need to be a consensus config file. About svn, I like the idea of eventually migrating to using it over CVS (I think BioPython and BioJava have plans to but I'm not sure) but I don't really know enough to say how feasible/difficult the migration path would be. Anyone know? chris On Jun 14, 2007, at 12:38 PM, Mark Johnson wrote: > The nice thing about Perl Tidy is that everybody can have their > own config file. There could be a bioperl default config that gets > applied at checkin time. Anybody that didn't like it could script > checkouts to get run through their own config. Diffs might get a > little hairy, but as long as you tidy before diffing, it shouldn't be > too bad. Speaking of which....coding style is controversial enough, > but since that's already been opened, what about CVS vs Subversion? 8) > Some of the scripting for this sort of thing might be easer in > Subversion. 
Though maybe something like Git would fit the developer > model better (more support for distributed development). > > On 6/14/07, Sendu Bala wrote: >> Nathan S. Haigh wrote: >>> I'm just wondering if anyone passes their modules through >>> perltidy in >>> order for them to have the same look/feel? If so, do you have a >>> .perltidyrc file? Also, is it worth running the Bioperl modules >>> through it? >> >> I don't use it, but I was contemplating the same thing. Chris uses it >> from time to time and I think we have a similar taste in style. >> >> But we'd have to hammer something out that was agreeable to everyone. Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From johnsonm at gmail.com Thu Jun 14 18:46:24 2007 From: johnsonm at gmail.com (Mark Johnson) Date: Thu, 14 Jun 2007 13:46:24 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> Message-ID: If there were a default/standard/consensus bioperl perltidy config file, I would probably use it prior to checkin, on my own, so I could code in my schizophrenic style without worrying about starting any format wars. When I'm fixing or enhancing somebody else's code, I always try to adapt to whatever style they used, even if it grates on my nerves. I'd love to not have to worry about that with Bioperl. Of course, nobody will ever agree on a standard, so it's probably a moot point.
8) On 6/14/07, Chris Fields wrote: > To chip in on this, I only use perltidy when I need to clean bioperl > code up for debugging (particularly if blocks are hard to see) and > just use the defaults. I agree it would be nice to have everything > tidied up but it'll definitely need to be a consensus config file. > > About svn, I like the idea of eventually migrating to using it over > CVS (I think BioPython and BioJava have plans to but I'm not sure) > but I don't really know enough to say how feasible/difficult the > migration path would be. Anyone know? > > chris From jason at bioperl.org Thu Jun 14 19:00:09 2007 From: jason at bioperl.org (Jason Stajich) Date: Thu, 14 Jun 2007 12:00:09 -0700 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> Message-ID: On Jun 14, 2007, at 11:32 AM, Chris Fields wrote: > To chip in on this, I only use perltidy when I need to clean bioperl > code up for debugging (particularly if blocks are hard to see) and > just use the defaults. I agree it would be nice to have everything > tidied up but it'll definitely need to be a consensus config file. > Can we do any sort of massive conversion at some logical timepoint? Probably after a branch release or something? Because it basically means we're going to have differences on nearly every line, which is going to make diff-ing difficult when debugging old/new versions. Maybe it is not a problem because we aren't introducing any new bugs! > About svn, I like the idea of eventually migrating to using it over > CVS (I think BioPython and BioJava have plans to but I'm not sure) > but I don't really know enough to say how feasible/difficult the > migration path would be. Anyone know? > It's doable but non-trivial. The cvs2svn script (python, gah!) exists to help with this. There are pros and cons to converting.
There is a fair amount of documentation and other pointers out there that point to the CVS server for getting latest code so we'd need to think about whether we'd support some sort of backwards compatible SVN -> CVS for read-only or what. Mostly it will need someone to lead the charge - I made a go at doing it in the winter, but I really don't have the SVN-foo to make this work. We'd need someone with SVN experience to step up and help. You can always try and we can play with the converted repository for a while without making it the new code base. -j > chris > > On Jun 14, 2007, at 12:38 PM, Mark Johnson wrote: > >> The nice thing about Perl Tidy is that everybody can have their >> own config file. There could be a bioperl default config that gets >> applied at checkin time. Anybody that didn't like it could script >> checkouts to get run through their own config. Diffs might get a >> little hairy, but as long as you tidy before diffing, it shouldn't be >> too bad. Speaking of which....coding style is controversial enough, >> but since that's already been opened, what about CVS vs >> Subversion? 8) >> Some of the scripting for this sort of thing might be easer in >> Subversion. Though maybe something like Git would fit the developer >> model better (more support for distributed development). >> >> On 6/14/07, Sendu Bala wrote: >>> Nathan S. Haigh wrote: >>>> I'm just wondering if anyone passes their modules through >>>> perltidy in >>>> order for them to have the same look/feel? If so, do you have a >>>> .perltidyrc file? Also, is it worth running the Bioperl modules >>>> through it? >>> >>> I don't use it, but I was contemplating the same thing. Chris >>> uses it >>> from time to time and I think we have a similar taste in style. >>> >>> But we'd have to hammer something out that was agreeable to >>> everyone. 
>>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From jason at bioperl.org Thu Jun 14 19:01:27 2007 From: jason at bioperl.org (Jason Stajich) Date: Thu, 14 Jun 2007 12:01:27 -0700 Subject: [Bioperl-l] cvs changes in working copy In-Reply-To: <46717D5B.5040108@sheffield.ac.uk> References: <46717D5B.5040108@sheffield.ac.uk> Message-ID: cvs update | grep '^M' On Jun 14, 2007, at 10:39 AM, Nathan S. Haigh wrote: > Not sure if I'm being dense or if it's because I've been working with > svn recently, but - how do I get a list of files that are different in > my working copy compared to the repository? 
> > Cheers > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From cjfields at uiuc.edu Thu Jun 14 19:20:46 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 14 Jun 2007 14:20:46 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> Message-ID: On Jun 14, 2007, at 2:00 PM, Jason Stajich wrote: > > On Jun 14, 2007, at 11:32 AM, Chris Fields wrote: > >> To chip in on this, I only use perltidy when I need to clean bioperl >> code up for debugging (particularly if blocks are hard to see) and >> just use the defaults. I agree it would be nice to have everything >> tidied up but it'll definitely need to be a consensus config file. >> > > Can we do any sort of massive conversion at some logical timepoint. > Probably after a branch release or something? Because it basically > means we're going to have differences on nearly every line which is > going to make diff-ing difficult when debugging old/new versions. > Maybe it is not a problem because we aren't introducing and new bugs! I agree; if we intend on doing this it should be all at once, maybe on a branch dedicated to ensure that code changes don't tank tests (they shouldn't but one never knows). We would then need a script up- and-running that tidies everything up prior to commits (though what happens if perltidy tanks?...). Sendu, up for it? >> About svn, I like the idea of eventually migrating to using it over >> CVS (I think BioPython and BioJava have plans to but I'm not sure) >> but I don't really know enough to say how feasible/difficult the >> migration path would be. Anyone know? >> > > It's doable but non-trivial. cvs2svn (python gah!) script exists to > help in this. 
There are pros and cons to converting. There is a > fair amount of documentation and other pointers out there that point > to the CVS server for getting latest code so we'd need to think about > whether we'd support some sort of backwards compatible SVN -> CVS for > read-only or what. > > Mostly it will need someone to lead the charge - I made a go at doing > it in the winter, but I really don't have the SVN-foo to make this > work. We'd need someone with SVN experience to step up and help. > You can always try and we can play with the converted repository for > a while without making it the new code base. > > -j Stepped into that one, didn't I! I'll look into how much effort is involved and try getting something going in the next month or two, maybe sooner if time permits. I'm lacking on SVN-foo as well but it might be worth looking into. chris From arareko at campus.iztacala.unam.mx Thu Jun 14 19:50:39 2007 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Thu, 14 Jun 2007 14:50:39 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> Message-ID: <46719C0F.5010706@campus.iztacala.unam.mx> Chris Fields wrote: > On Jun 14, 2007, at 2:00 PM, Jason Stajich wrote: > >> On Jun 14, 2007, at 11:32 AM, Chris Fields wrote: >> >>> About svn, I like the idea of eventually migrating to using it over >>> CVS (I think BioPython and BioJava have plans to but I'm not sure) >>> but I don't really know enough to say how feasible/difficult the >>> migration path would be. Anyone know? >>> >> It's doable but non-trivial. cvs2svn (python gah!) script exists to >> help in this. There are pros and cons to converting. 
There is a >> fair amount of documentation and other pointers out there that point >> to the CVS server for getting latest code so we'd need to think about >> whether we'd support some sort of backwards compatible SVN -> CVS for >> read-only or what. >> >> Mostly it will need someone to lead the charge - I made a go at doing >> it in the winter, but I really don't have the SVN-foo to make this >> work. We'd need someone with SVN experience to step up and help. >> You can always try and we can play with the converted repository for >> a while without making it the new code base. >> >> -j > > Stepped into that one, didn't I! I'll look into how much effort is > involved and try getting something going in the next month or two, > maybe sooner if time permits. I'm lacking on SVN-foo as well but it > might be worth looking into. > > chris > Chris D has worked with CVS-SVN transitioning for other projects, maybe he can shed some light on this. Mauricio. -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Genética Unidad de Morfofisiología y Función Facultad de Estudios Superiores Iztacala, UNAM From sac at bioperl.org Thu Jun 14 21:33:39 2007 From: sac at bioperl.org (Steve Chervitz) Date: Thu, 14 Jun 2007 14:33:39 -0700 Subject: [Bioperl-l] How can I pull out all instances of a motif from a genome sequence and output them as a BED file? In-Reply-To: <5F3F49B4-CE07-4DD7-B82E-9DDE42B516D0@uiuc.edu> References: <5F3F49B4-CE07-4DD7-B82E-9DDE42B516D0@uiuc.edu> Message-ID: <8f200b4c0706141433i37267774u1dc2193d8508c47b@mail.gmail.com> This issue was discussed recently here. Check out this thread: http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15046/focus=15048 Some of the tools mentioned in the FAQ item Chris mentioned do not report where the match occurred, only that a match occurred (String::Approx, agrep), though some do report match locations (fuzznuc, fuzzprot; not sure about TFBS).
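For reporting match locations with nothing but core Perl, here is a minimal sketch rather than a definitive implementation. It assumes the motif can be written as a plain Perl regex, that the sequence is already in a string (for example from $seq_object->seq()), and that BED wants 0-based, half-open coordinates. The sub name motif_to_bed, the chromosome label 'chr3L' and the test sequence are made up for illustration. It also sidesteps the "list of 1's" problem in the quoted snippet further down: "." and "+" have the same precedence in Perl and associate left to right, so "text" . pos($s)+1 adds 1 to the whole concatenated string (which numifies to 0) instead of to pos().

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return one [chrom, start, end, name] record per motif occurrence.
# Coordinates follow the BED convention: start is the 0-based offset of the
# first matched base, end is the offset one past the last matched base.
# NOTE: motif_to_bed is a hypothetical helper, not part of BioPerl.
sub motif_to_bed {
    my ($chrom, $seq, $motif) = @_;
    my @records;
    # The motif sits inside a zero-width lookahead so that overlapping
    # occurrences are reported too; group 1 captures the matched bases.
    while ($seq =~ /(?=($motif))/gi) {
        # $-[1] and $+[1] are the start and end offsets of capture group 1.
        push @records, [ $chrom, $-[1], $+[1], $motif ];
    }
    return @records;
}

# Hypothetical input; in real use the string would come from Bio::SeqIO.
my $seq = 'CCAAAATTAAAG';
for my $rec (motif_to_bed('chr3L', $seq, 'AAA')) {
    print join("\t", @$rec), "\n";    # tab-separated BED-style line
}
```

If the original print statement is preferred, wrapping the arithmetic in parentheses, (pos($sequence_as_a_string) + 1), should be enough to get real positions out of it.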
My Bio::Tools::SeqPattern module does not even perform any matches, it just encapsulates a regular expression for a nuc or protein motif and knows how to handle ambiguity code expansion and reverse complementing. The idea is that you can use this to convert a biological sequence motif into a string suitable for use in a perl regex. Adding a match() method to this module would be handy. There an example script for it in examples/tools of the distro (which, btw references an obsolete module, so it won't run as is -- I'll fix). Steve On 6/13/07, Chris Fields wrote: > This is answered in the FAQ (sorry if the URL wraps, but we don't > like tinyurls): > > http://www.bioperl.org/wiki/ > FAQ#How_do_I_do_motif_searches_with_BioPerl.3F_Can_I_do_. > 22find_all_sequences_that_are_75.25_identical.22_to_a_given_motif.3F > > chris > > On Jun 13, 2007, at 7:20 PM, John Cumbers wrote: > > > Hello, > > > > I have a simple problem, I'm trying to search a genome sequence for > > a motif, > > I then want to output a BED file to display all the locations of > > this motif > > on the UCSC Genome Browser. I could not find a script to do this, > > so I > > started to write my own. I'm new to perl and my code below was my > > attempt > > to read the sequence string and output the index bp of the start of > > each > > motif. With this I could build the BED file myself, which requires > > start > > and finish base pairs. > > > > For the first motif I can output the start index, but when I try > > and read > > the next one off the sequence it does not work. Instead I just get an > > output of a list of 1's. I realise that this is more a request for > > some > > simple perl help, but any help much appreciated. > > > > Best wishes, > > John > > > > > > $seq_object = read_sequence > > ("Drosophila.Chr3.test.AE014296.fasta"); #turn > > my FASTA file into a seq object. 
> > $sequence_as_a_string = $seq_object->seq(); #turn it into a string > > # search $sequence_as_a_string string for motif AAA as example > > # if found, return the index that it is found at > > > > while ($sequence_as_a_string =~ m/AAA/g) { > > print "Found '$&'. Next attempt at character " . > > pos($sequence_as_a_string)+1 . "\n"; > > } > > > > > > > > -- > > John Cumbers, Graduate Student > > Biology and Medicine > > Brown University, Box G-W > > Providence, Rhode Island, 02912, USA > > Tel USA: +1 401 523 8190, Fax: +1 401 863-2166 > > UK to USA: 0207 617 7824 > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From hlapp at gmx.net Thu Jun 14 23:04:11 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 14 Jun 2007 19:04:11 -0400 Subject: [Bioperl-l] cvs changes in working copy In-Reply-To: References: <46717D5B.5040108@sheffield.ac.uk> Message-ID: <3B262E6A-2C90-49FA-BCA1-BF1900C5AC3A@gmx.net> Actually, that will update your repository. If you just wanted to take a peek you would use cvs status: $ cvs status | grep 'Locally Modified' -hilmar On Jun 14, 2007, at 3:01 PM, Jason Stajich wrote: > cvs update | grep '^M' > > On Jun 14, 2007, at 10:39 AM, Nathan S. Haigh wrote: > >> Not sure if I'm being dense or if it's because I've been working with >> svn recently, but - how do I get a list of files that are >> different in >> my working copy compared to the repository? 
>> >> Cheers >> Nath > > -- > Jason Stajich > jason at bioperl.org > http://jason.open-bio.org/ -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From mmokrejs at ribosome.natur.cuni.cz Fri Jun 15 07:28:17 2007 From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=) Date: Fri, 15 Jun 2007 09:28:17 +0200 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? In-Reply-To: References: <466938F6.7050903@ribosome.natur.cuni.cz> <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> <467178AE.5040905@ribosome.natur.cuni.cz> <46717990.6040509@ribosome.natur.cuni.cz> Message-ID: <46723F91.60501@ribosome.natur.cuni.cz> Chris Fields wrote: > Is 99.gb supposed to be a GenBank file? And you're loading it into Yes, it was attached to the email. ;) > embl2picture (which I assume takes EMBL format files)? Without example > code we can easily make the wrong assumptions (i.e. that this is user > error and not a BioPerl problem). use constant USAGE =><<END; Usage: $0 <file> Render a GenBank/EMBL entry into drawable form. Return as a GIF or PNG image on standard output. File must be in embl, genbank, or another SeqIO-recognized format. Only the first entry will be rendered. Example to try: embl2picture.pl factor7.embl | display - END > > Also, I don't believe the feature plotting scripts plot circular > chromosomes/plasmids. If you want this functionality you'll have to > code it for yourself. That's a pity it does not, but at least someone could improve the docs.
;) Unfortunately I don't have the time to rewrite the code myself now, I need a working, standalone, already available tool. :( M. > > chris > > On Jun 14, 2007, at 12:23 PM, Martin MOKREJ? wrote: > >> Martin MOKREJ? wrote: >> >>>> Also, there is a *huge* amount of documentation and examples on the >>>> BioPerl website. >>>> >>>> http://www.bioperl.org/wiki/HOWTOs >>> >>> You mean >>> http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File >>> >>> ? ;-) >> >> $ perl embl2picture.pl ~/99.gb | display - >> Error returned while evaluating value of 'description' option for >> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature >> Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method >> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line >> 141, line 125. >> >> Error returned while evaluating value of 'description' option for >> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa57f0), feature >> Bio::Location::Simple=HASH(0x893ebb8): Can't locate object method >> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line >> 141, line 125. >> >> Error returned while evaluating value of 'description' option for >> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5880), feature >> Bio::Location::Simple=HASH(0x893ebac): Can't locate object method >> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line >> 141, line 125. >> >> Error returned while evaluating value of 'description' option for >> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5430), feature >> Bio::Location::Simple=HASH(0x893e720): Can't locate object method >> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line >> 141, line 125. 
>> >> Error returned while evaluating value of 'description' option for >> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa546c), feature >> Bio::Location::Simple=HASH(0x893e6b4): Can't locate object method >> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line >> 141, line 125. >> $ >> >> The plasmid is a circular DNA, why is the diagram in linear? ;-) >> >> Martin >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > > -- Dr. Martin Mokrejs Dept. of Genetics and Microbiology Faculty of Science, Charles University Vinicna 5, 128 43 Prague, Czech Republic http://www.iresite.org http://www.iresite.org/~mmokrejs From dhoworth at mrc-lmb.cam.ac.uk Fri Jun 15 08:59:09 2007 From: dhoworth at mrc-lmb.cam.ac.uk (Dave Howorth) Date: Fri, 15 Jun 2007 09:59:09 +0100 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? In-Reply-To: <46717990.6040509@ribosome.natur.cuni.cz> References: <466938F6.7050903@ribosome.natur.cuni.cz> <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> <467178AE.5040905@ribosome.natur.cuni.cz> <46717990.6040509@ribosome.natur.cuni.cz> Message-ID: <467254DD.3010505@mrc-lmb.cam.ac.uk> Martin MOKREJ? wrote: >>> Also, there is a *huge* amount of documentation and examples on >>> the BioPerl website. >>> >>> http://www.bioperl.org/wiki/HOWTOs >> You mean >> http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File >> ? 
;-) > > $ perl embl2picture.pl ~/99.gb | display - Error returned while > evaluating value of 'description' option for glyph > Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature > Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method > "all_tags" via package "Bio::Location::Simple" at embl2picture.pl > line 141, line 125. Hmm, an error at line 141 of a 69 line script? Methinks you're not actually running the script that's presented on the wiki page you quoted. I cut-and-pasted the script and your file and it worked for me (at least, it produced an image, along with a bunch of OOPS lines) HTH, Dave From n.haigh at sheffield.ac.uk Fri Jun 15 10:21:38 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 15 Jun 2007 11:21:38 +0100 Subject: [Bioperl-l] Installation using --install_base Message-ID: <46726832.7080601@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 I'm setting up a new installation of Debian 4.0 at home and thought I'd try to install BioPerl as a normal user rather than root. So in CPAN options I set the --install_base to /home/username/perl and set PERL5LIB to point to the same place. Now, when I run "perl Build.PL" on the HEAD of BioPerl as a non-root user and ask to install all optional modules, it tries to install them through CPAN - however it seems to fail because some dependencies don't seem to want to install in a user directory. Has anyone else found this or might I be doing something wrong?
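For reference, the setup described above can be sketched in shell. This is only a sketch under assumptions: `$HOME/perl` stands in for /home/username/perl, and it relies on Module::Build (and a recent enough ExtUtils::MakeMaker) placing pure-Perl modules under `lib/perl5` inside the install base, so PERL5LIB may need to name that subdirectory rather than the base itself. The CPAN settings shown in the comments are standard CPAN.pm configuration keys, but whether they cure this particular failure has not been tested here.

```shell
#!/bin/sh
# Sketch only: INSTALL_BASE is an assumed location, not a verified setup.
INSTALL_BASE="$HOME/perl"

# With an install base, modules land in <base>/lib/perl5, so prepend that
# directory (not the base itself) to any existing PERL5LIB.
PERL5LIB="$INSTALL_BASE/lib/perl5${PERL5LIB:+:$PERL5LIB}"
export PERL5LIB
echo "PERL5LIB=$PERL5LIB"

# Inside the CPAN shell (started with `cpan`), the matching settings would be
# something like the following; the CPAN shell does not expand $HOME, so a
# literal path is used:
#   o conf makepl_arg "INSTALL_BASE=/home/username/perl"
#   o conf mbuildpl_arg "--install_base /home/username/perl"
#   o conf commit

# BioPerl itself would then be configured with the same base:
echo "perl Build.PL --install_base $INSTALL_BASE"
```

The script only sets the environment and prints the Build.PL command; it does not install anything.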
Nath -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGcmgyczuW2jkwy2gRAtgqAKDIv717ciVHr5V+Z1kqPV2a++E8dgCfYr2a VPt4tEPLW2J+BiKnN3B8aV8= =c+9z -----END PGP SIGNATURE----- From bix at sendu.me.uk Fri Jun 15 10:07:04 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 15 Jun 2007 11:07:04 +0100 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> Message-ID: <467264C8.4020202@sendu.me.uk> Chris Fields wrote: > On Jun 14, 2007, at 2:00 PM, Jason Stajich wrote: > >> On Jun 14, 2007, at 11:32 AM, Chris Fields wrote: >> >>> To chip in on this, I only use perltidy when I need to clean bioperl >>> code up for debugging (particularly if blocks are hard to see) and >>> just use the defaults. I agree it would be nice to have everything >>> tidied up but it'll definitely need to be a consensus config file. >>> >> Can we do any sort of massive conversion at some logical timepoint. >> Probably after a branch release or something? Because it basically >> means we're going to have differences on nearly every line which is >> going to make diff-ing difficult when debugging old/new versions. >> Maybe it is not a problem because we aren't introducing and new bugs! Sorry, can you clarify the problem you envisage? And why would making a branch release help? > I agree; if we intend on doing this it should be all at once, maybe > on a branch dedicated to ensure that code changes don't tank tests > (they shouldn't but one never knows). We would then need a script up- > and-running that tidies everything up prior to commits (though what > happens if perltidy tanks?...). > > Sendu, up for it? If its going to be difficult and a hassle, for such an unnecessary thing I'm not sure its worth it. There are more pressing things to be done for Bioperl. 
If I can just run perltidy on the entire package and commit, I'd do it. If that's not appropriate, I won't. >>> About svn [snip] > Stepped into that one, didn't I! I'll look into how much effort is > involved and try getting something going in the next month or two, > maybe sooner if time permits. I'm lacking on SVN-foo as well but it > might be worth looking into. I'd put this in the unnecessary-but-nice category as well. If it will be as easy as my ->new change, go ahead. If not, there are more pressing matters (POD fixing, test script updating and finishing...). From n.haigh at sheffield.ac.uk Fri Jun 15 10:35:40 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 15 Jun 2007 11:35:40 +0100 Subject: [Bioperl-l] Installation using --install_base Message-ID: <46726B7C.7070902@sheffield.ac.uk> I'm setting up a new installation of Debian 4.0 at home and thought I'd try to install BioPerl as a normal user rather than root. So in CPAN options I set the --install_base to /home/username/perl and set PERL5LIB to point to the same place. Now, when I run "perl Build.PL" on the HEAD of BioPerl as a non root user and ask to install all optional modules, it tries to install them through CPAN - however it seems to fail because some dependencies don't seem to want to install in a user directory. Has anyone else found this or might I be doing something wrong? Nath From bix at sendu.me.uk Fri Jun 15 10:45:48 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 15 Jun 2007 11:45:48 +0100 Subject: [Bioperl-l] Installation using --install_base In-Reply-To: <46726832.7080601@sheffield.ac.uk> References: <46726832.7080601@sheffield.ac.uk> Message-ID: <46726DDC.8090202@sendu.me.uk> Nathan S. Haigh wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > I'm setting up a new installation of Debian 4.0 at home and thought I'd > try to install BioPerl as a normal user rather than root.
So in CPAN > options I set the --install_base to /home/username/perl and set PERL5LIB > to point to the same place. > > Now, when I run "perl Build.PL" on the HEAD of BioPerl as a non root > user and ask to install all optional modules, it tries to install them > through CPAN - however it seems to fail because some dependencies don't > seem to want to install in a user directory. > > Has anyone else found this or might I be doing something wrong? You'll need to configure CPAN to install into your user directory. Upgrade to the latest version, then go read the docs on the various configurable options. I thought I at least mentioned this in the Bioperl INSTALL doc. If not, can someone come up with a concise clarification? From sdavis2 at mail.nih.gov Fri Jun 15 10:56:08 2007 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Fri, 15 Jun 2007 06:56:08 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <467264C8.4020202@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> Message-ID: <46727048.3080904@mail.nih.gov> Sendu Bala wrote: > If its going to be difficult and a hassle, for such an unnecessary thing > I'm not sure its worth it. There are more pressing things to be done for > Bioperl. > > If I can just run perltidy on the entire package and commit, I'd do it. > If that's not appropriate, I won't. I agree with the sentiment noted above. I'm a bit of an outsider here, but bioperl is a collaborative project. Not everyone has the same sentiments about what "correct" style means. As a programmer, I really wouldn't want significant changes on the style of my code. And perl happily puts up with many styles. I would say leave things as they are--let the individual programmers choose. It reduces the amount of work of questionable importance and allows the coding style freedom that perl supports. Just my $.02. 
Sean From cjfields at uiuc.edu Fri Jun 15 14:05:07 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 15 Jun 2007 09:05:07 -0500 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? In-Reply-To: <46723F91.60501@ribosome.natur.cuni.cz> References: <466938F6.7050903@ribosome.natur.cuni.cz> <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> <467178AE.5040905@ribosome.natur.cuni.cz> <46717990.6040509@ribosome.natur.cuni.cz> <46723F91.60501@ribosome.natur.cuni.cz> Message-ID: On Jun 15, 2007, at 2:28 AM, Martin MOKREJŠ wrote: > Chris Fields wrote: >> Is 99.gb supposed to be a GenBank file? And you're loading it into > > Yes, it was attached to the email. ;) Sorry about that. I notice that '.' was added, but the spacing seemed off. I think bioperl catches that fine but it's something Wayne should consider. >> embl2picture (which I assume takes EMBL format files)? Without >> example >> code we can easily make the wrong assumptions (i.e. that this is user >> error and not a BioPerl problem). > > use constant USAGE =>< Usage: $0 > Render a GenBank/EMBL entry into drawable form. > Return as a GIF or PNG image on standard output. > > File must be in embl, genbank, or another SeqIO- > recognized format. Only the first entry will be > rendered. > > Example to try: > embl2picture.pl factor7.embl | display - > > END Horribly named script (should be seq2picture, since it converts both gb/embl). The use of 'all_tags' makes me think the script version you are using is old, as those methods have long since been renamed. Dave has it working though, so maybe your version has been updated? The 'use of initialized data in' errors are probably from inclusion of mandatory fields with no data or '.'. >> Also, I don't believe the feature plotting scripts plot circular >> chromosomes/plasmids. If you want this functionality you'll have to >> code it for yourself. > > That's a pity it does not, but at least if someone could improve > the docs.
;) > Unfortunately I don't have the time to rewrite the code myself now, > I need a working, standalone, already available tool. :( > M. As I said, unless someone shows interest and codes it just won't get done. We have had very little interest in this, either b/c there are tools already out there to do this very thing (multitudes of plasmid drawing programs, some free like ApE) or that nobody's bothered to write it up. chris From cjfields at uiuc.edu Fri Jun 15 14:22:23 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 15 Jun 2007 09:22:23 -0500 Subject: [Bioperl-l] Perltidy and... SVN and ...Re: Perltidy In-Reply-To: <46727048.3080904@mail.nih.gov> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> <46727048.3080904@mail.nih.gov> Message-ID: <78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu> On Jun 15, 2007, at 5:56 AM, Sean Davis wrote: > Sendu Bala wrote: >> If its going to be difficult and a hassle, for such an unnecessary >> thing >> I'm not sure its worth it. There are more pressing things to be >> done for >> Bioperl. >> >> If I can just run perltidy on the entire package and commit, I'd >> do it. >> If that's not appropriate, I won't. > > I agree with the sentiment noted above. I'm a bit of an outsider > here, > but bioperl is a collaborative project. Not everyone has the same > sentiments about what "correct" style means. As a programmer, I > really > wouldn't want significant changes on the style of my code. And perl > happily puts up with many styles. I would say leave things as they > are--let the individual programmers choose. It reduces the amount of > work of questionable importance and allows the coding style freedom > that > perl supports. > > Just my $.02. > > Sean I tend to run it on modules that need some reformatting (SearchIO::blast comes to mind). 
I believe you're correct when this comes down to programming style, but I think this echoes a sentiment (frustration, perhaps) that some of us have with long-term maintenance of said code. Maybe a compromise: include a copy of .perltidyrc with the distribution that goes by what a consensus wants or by the general rules laid out in Perl Best Practices (spaced settings, use of spaces over tabs, etc). Conversion would be encouraged but voluntary, with the caveat that if someone needs to clean up code down the road (bug fixes, enhancements, etc) and if the original author isn't able to add it in themselves, it could be perltidy'd in order to help the developer (locate and fix the issue)|(add relevant enhancement where needed). chris From cjfields at uiuc.edu Fri Jun 15 14:56:23 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 15 Jun 2007 09:56:23 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <467264C8.4020202@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> Message-ID: <3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote: >>>> ... >>> Can we do any sort of massive conversion at some logical timepoint. >>> Probably after a branch release or something? Because it basically >>> means we're going to have differences on nearly every line which is >>> going to make diff-ing difficult when debugging old/new versions. >>> Maybe it is not a problem because we aren't introducing and new >>> bugs! > > Sorry, can you clarify the problem you envisage? And why would > making a branch release help? Maybe the worry is that mass conversion in such a large codebase could lead to hard-to-locate bugs. Shouldn't occur but who knows w/o trying? 
>> I agree; if we intend on doing this it should be all at once, >> maybe on a branch dedicated to ensure that code changes don't >> tank tests (they shouldn't but one never knows). We would then >> need a script up- and-running that tidies everything up prior to >> commits (though what happens if perltidy tanks?...). >> Sendu, up for it? > > If its going to be difficult and a hassle, for such an unnecessary > thing I'm not sure its worth it. There are more pressing things to > be done for Bioperl. > > If I can just run perltidy on the entire package and commit, I'd do > it. If that's not appropriate, I won't. The choices aren't necessarily all or nothing. What about voluntary, recommended use of a perltidy config file included with the distribution, with additional 'caveats'? See my response to Sean. >>>> About svn > [snip] >> Stepped into that one, didn't I! I'll look into how much effort >> is involved and try getting something going in the next month or >> two, maybe sooner if time permits. I'm lacking on SVN-foo as >> well but it might be worth looking into. > > I'd put this in the unnecessary-but-nice category as well. If it > will be as easy as my ->new change, go ahead. If not, there are > more pressing matters (POD fixing, test script updating and > finishing...). A few other open-bio projects have actively discussed a CVS->SVN migration (BioRuby and I think BioPython, though the latter could be wrong). As I said, "it might be worth looking into" to weigh the pros/cons, get others opinions from others who have made the transition, etc. We could, as Jason suggested, even set up a tester SVN w/o making it the default codebase (lock it off to a few testers, have CVS commits automatically/manually carry over to SVN, etc). I agree with you that it's not feasible to switch over prior to a release and that there are more pressing issues, but it doesn't hurt having an open discussion about it. 
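As a rough illustration of the voluntary perltidy config file floated earlier in this message, the shell sketch below writes a small profile to `.perltidyrc` in the current directory. The options are a subset of the settings published in Perl Best Practices; which ones to pick is purely an assumption for illustration and not an agreed BioPerl style.

```shell
#!/bin/sh
# Write a sample perltidy profile. The selection of options is illustrative
# only; a real shared profile would need consensus on the list.
cat > .perltidyrc <<'EOF'
-l=78   # maximum line width of 78 columns
-i=4    # indent each level by 4 columns
-ci=4   # continuation indent of 4 columns
-pt=1   # medium parenthesis tightness
-bt=1   # medium brace tightness
-sbt=1  # medium square-bracket tightness
-nsfs   # no space before semicolons in for loops
-nolq   # do not outdent long quoted strings
EOF

# perltidy looks for .perltidyrc in the current directory before $HOME.
if command -v perltidy >/dev/null 2>&1; then
    echo "perltidy found; try e.g.: perltidy SomeModule.pm"
else
    echo "perltidy not installed; profile written to .perltidyrc"
fi
```

Shipping such a file with the distribution would let anyone who chooses to tidy a module get the same result, without forcing a mass conversion.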
chris From sdavis2 at mail.nih.gov Fri Jun 15 15:15:57 2007 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Fri, 15 Jun 2007 11:15:57 -0400 Subject: [Bioperl-l] Perltidy and... SVN and ...Re: Perltidy In-Reply-To: <78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> <46727048.3080904@mail.nih.gov> <78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu> Message-ID: <4672AD2D.2090001@mail.nih.gov> Chris Fields wrote: > > On Jun 15, 2007, at 5:56 AM, Sean Davis wrote: > >> Sendu Bala wrote: >>> If its going to be difficult and a hassle, for such an unnecessary thing >>> I'm not sure its worth it. There are more pressing things to be done for >>> Bioperl. >>> >>> If I can just run perltidy on the entire package and commit, I'd do it. >>> If that's not appropriate, I won't. >> >> I agree with the sentiment noted above. I'm a bit of an outsider here, >> but bioperl is a collaborative project. Not everyone has the same >> sentiments about what "correct" style means. As a programmer, I really >> wouldn't want significant changes on the style of my code. And perl >> happily puts up with many styles. I would say leave things as they >> are--let the individual programmers choose. It reduces the amount of >> work of questionable importance and allows the coding style freedom that >> perl supports. >> >> Just my $.02. >> >> Sean > > I tend to run it on modules that need some reformatting (SearchIO::blast > comes to mind). I believe you're correct when this comes down to > programming style, but I think this echoes a sentiment (frustration, > perhaps) that some of us have with long-term maintenance of said code. 
> > Maybe a compromise: include a copy of .perltidyrc with the distribution > that goes by what a consensus wants or by the general rules laid out in > Perl Best Practices (spaced settings, use of spaces over tabs, etc). > Conversion would be encouraged but voluntary, with the caveat that if > someone needs to clean up code down the road (bug fixes, enhancements, > etc) and if the original author isn't able to add it in themselves, it > could be perltidy'd in order to help the developer (locate and fix the > issue)|(add relevant enhancement where needed). Don't get me wrong--I think whatever makes bioperl a better, more maintainable beast should be what is done. The bioperl gurus should absolutely do what is best for them for code maintainability. Sean From n.haigh at sheffield.ac.uk Fri Jun 15 15:17:15 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 15 Jun 2007 16:17:15 +0100 Subject: [Bioperl-l] Perltidy and... SVN and ...Re: Perltidy In-Reply-To: <78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> <46727048.3080904@mail.nih.gov> <78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu> Message-ID: <4672AD7B.4050109@sheffield.ac.uk> Chris Fields wrote: > On Jun 15, 2007, at 5:56 AM, Sean Davis wrote: > >> Sendu Bala wrote: >>> If its going to be difficult and a hassle, for such an unnecessary >>> thing >>> I'm not sure its worth it. There are more pressing things to be >>> done for >>> Bioperl. >>> >>> If I can just run perltidy on the entire package and commit, I'd >>> do it. >>> If that's not appropriate, I won't. >> I agree with the sentiment noted above. I'm a bit of an outsider >> here, >> but bioperl is a collaborative project. Not everyone has the same >> sentiments about what "correct" style means. 
As a programmer, I >> really >> wouldn't want significant changes on the style of my code. And perl >> happily puts up with many styles. I would say leave things as they >> are--let the individual programmers choose. It reduces the amount of >> work of questionable importance and allows the coding style freedom >> that >> perl supports. >> >> Just my $.02. >> >> Sean > > I tend to run it on modules that need some reformatting > (SearchIO::blast comes to mind). I believe you're correct when this > comes down to programming style, but I think this echoes a sentiment > (frustration, perhaps) that some of us have with long-term > maintenance of said code. > > Maybe a compromise: include a copy of .perltidyrc with the > distribution that goes by what a consensus wants or by the general > rules laid out in Perl Best Practices (spaced settings, use of spaces > over tabs, etc). RE spaces, tabs etc - how well is the different coding styles handled for displaying in html and via the online browsable cvs? Conversion would be encouraged but voluntary, with > the caveat that if someone needs to clean up code down the road (bug > fixes, enhancements, etc) and if the original author isn't able to > add it in themselves, it could be perltidy'd in order to help the > developer (locate and fix the issue)|(add relevant enhancement where > needed). > > chris > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From johnsonm at gmail.com Fri Jun 15 19:37:26 2007 From: johnsonm at gmail.com (Mark Johnson) Date: Fri, 15 Jun 2007 14:37:26 -0500 Subject: [Bioperl-l] Why does Bio::DB::GFF::Feature::gff3_string swap start and stop coordinates?? In-Reply-To: References: <79FDA731-CC37-42B0-8200-0865F52C1CAC@uiuc.edu> <62034FE5-C375-49F3-9A4E-2545F93615F4@uiuc.edu> <9FAD90F3-79B3-4002-9A11-6C11F7D00614@uiuc.edu> Message-ID: Patches waiting in Bugzilla (Bug #2299). 
Changes: -Bio::Tools::Glimmer now emits Bio::SeqFeature::Generic features for prokaryotic reports (Glimmer2/Glimmer3) -Bio::Tools::Glimmer now produces features with Fuzzy or Split locations as appropriate (partial or circular/wraparound predictions) -Bio::Tools:Glimmer goes through the Glimmer3 .detail file to pull out sequence lengths -Bio::Tools::Run::Glimmer passes along the sequence length to Bio::Tools::Glimmer for Glimmer2 I should probably modify Bio::Tools::Genemark to use Bio::SeqFeature::Generic features for prokaryotic reports, to be consistent, but this is more likely to surprise people. If nobody screams about the change to Bio::Tools::Glimmer, I'll do it at some point. On 5/21/07, Chris Fields wrote: > > On May 21, 2007, at 7:29 PM, Torsten Seemann wrote: > > >> glimmer2/3 both assume the genome is circular by default (I'm > >> assuming since Glimmer2/3 are used for bacterial genomes). Acc. to > >> the Glimmer3 release notes the detail file has the information in the > >> header; from the Glimmer3 data used for tests: > > > > You beat me to the reply Chris - yes, Glimmer2/3 assume circular > > chromosome by default. I had forgotten about this in earlier > > discussions of the new Glimmer parsers as I normally run it in > > --linear / -L mode (even if I know it is circular) because it is > > easier to handle, and our sequencer/assembler team usually gets the > > origin of replication right. > > > >> Command: /bio/sw/glimmer3/bin/glimmer3 -o 50 -g 110 -t 30 ../BCTDNA > >> Glimmer3.icm Glimmer3 > > > > I did a double-take here - that's the path to my Glimmer3 > > installation! It took me a couple of minutes to realise that you got > > it from the bioperl test data I created. D'oh! :-) > > Yep, I forgot about that! > > >> There are options available for glimmer3 (-L, -X) that specify a > >> linear sequence or allow ORFs to extend past the end of the sequence > >> analyzed (the latter assumes a linear sequence). 
> > > > If the -L mode should produce Bio::Location::Split objects, I guess if > > -X is used > > it should produce Bio::Location::Fuzzy objects too... > > > > --Torsten > > True, didn't think about that one. Def. something to consider adding > in. > > chris > > > From cjfields at uiuc.edu Fri Jun 15 20:55:06 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 15 Jun 2007 15:55:06 -0500 Subject: [Bioperl-l] Why does Bio::DB::GFF::Feature::gff3_string swap start and stop coordinates?? In-Reply-To: References: <79FDA731-CC37-42B0-8200-0865F52C1CAC@uiuc.edu> <62034FE5-C375-49F3-9A4E-2545F93615F4@uiuc.edu> <9FAD90F3-79B3-4002-9A11-6C11F7D00614@uiuc.edu> Message-ID: I'll try getting to that in tonight. Been pretty tied up lately... chris On Jun 15, 2007, at 2:37 PM, Mark Johnson wrote: > Patches waiting in Bugzilla (Bug #2299). Changes: > > -Bio::Tools::Glimmer now emits Bio::SeqFeature::Generic features for > prokaryotic reports (Glimmer2/Glimmer3) > -Bio::Tools::Glimmer now produces features with Fuzzy or Split > locations as appropriate (partial or circular/wraparound predictions) > -Bio::Tools:Glimmer goes through the Glimmer3 .detail file to pull out > sequence lengths > -Bio::Tools::Run::Glimmer passes along the sequence length to > Bio::Tools::Glimmer for Glimmer2 > > I should probably modify Bio::Tools::Genemark to use > Bio::SeqFeature::Generic features for prokaryotic reports, to be > consistent, but this is more likely to surprise people. If nobody > screams about the change to Bio::Tools::Glimmer, I'll do it at some > point. > > On 5/21/07, Chris Fields wrote: >> >> On May 21, 2007, at 7:29 PM, Torsten Seemann wrote: >> >>>> glimmer2/3 both assume the genome is circular by default (I'm >>>> assuming since Glimmer2/3 are used for bacterial genomes). Acc. 
to >>>> the Glimmer3 release notes the detail file has the information >>>> in the >>>> header; from the Glimmer3 data used for tests: >>> >>> You beat me to the reply Chris - yes, Glimmer2/3 assume circular >>> chromosome by default. I had forgotten about this in earlier >>> discussions of the new Glimmer parsers as I normally run it in >>> --linear / -L mode (even if I know it is circular) because it is >>> easier to handle, and our sequencer/assembler team usually gets the >>> origin of replication right. >>> >>>> Command: /bio/sw/glimmer3/bin/glimmer3 -o 50 -g 110 -t 30 ../ >>>> BCTDNA >>>> Glimmer3.icm Glimmer3 >>> >>> I did a double-take here - that's the path to my Glimmer3 >>> installation! It took me a couple of minutes to realise that you got >>> it from the bioperl test data I created. D'oh! :-) >> >> Yep, I forgot about that! >> >>>> There are options available for glimmer3 (-L, -X) that specify a >>>> linear sequence or allow ORFs to extend past the end of the >>>> sequence >>>> analyzed (the latter assumes a linear sequence). >>> >>> If the -L mode should produce Bio::Location::Split objects, I >>> guess if >>> -X is used >>> it should produce Bio::Location::Fuzzy objects too... >>> >>> --Torsten >> >> True, didn't think about that one. Def. something to consider adding >> in. >> >> chris >> >> >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From rvos at interchange.ubc.ca Fri Jun 15 21:08:17 2007 From: rvos at interchange.ubc.ca (rvos) Date: Fri, 15 Jun 2007 14:08:17 -0700 (PDT) Subject: [Bioperl-l] SVN and ...Re: Perltidy Message-ID: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> Hi, I would very much prefer it if bioperl moved to svn. 
I'm considering merging Bio::Phylo (to the extent that that's possible/practical) with bioperl and move it to an OBF repository, but I'd rather not go back to CVS. Rutger -----Original Message----- > Date: Fri Jun 15 07:56:23 PDT 2007 > From: "Chris Fields" > Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy > To: "Sendu Bala" > > > On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote: > > >>>> ... > >>> Can we do any sort of massive conversion at some logical timepoint. > >>> Probably after a branch release or something? Because it basically > >>> means we're going to have differences on nearly every line which is > >>> going to make diff-ing difficult when debugging old/new versions. > >>> Maybe it is not a problem because we aren't introducing and new > >>> bugs! > > > > Sorry, can you clarify the problem you envisage? And why would > > making a branch release help? > > Maybe the worry is that mass conversion in such a large codebase > could lead to hard-to-locate bugs. Shouldn't occur but who knows w/o > trying? > > >> I agree; if we intend on doing this it should be all at once, > >> maybe on a branch dedicated to ensure that code changes don't > >> tank tests (they shouldn't but one never knows). We would then > >> need a script up- and-running that tidies everything up prior to > >> commits (though what happens if perltidy tanks?...). > >> Sendu, up for it? > > > > If its going to be difficult and a hassle, for such an unnecessary > > thing I'm not sure its worth it. There are more pressing things to > > be done for Bioperl. > > > > If I can just run perltidy on the entire package and commit, I'd do > > it. If that's not appropriate, I won't. > > The choices aren't necessarily all or nothing. What about voluntary, > recommended use of a perltidy config file included with the > distribution, with additional 'caveats'? See my response to Sean. > > >>>> About svn > > [snip] > >> Stepped into that one, didn't I! 
I'll look into how much effort > >> is involved and try getting something going in the next month or > >> two, maybe sooner if time permits. I'm lacking on SVN-foo as > >> well but it might be worth looking into. > > > > I'd put this in the unnecessary-but-nice category as well. If it > > will be as easy as my ->new change, go ahead. If not, there are > > more pressing matters (POD fixing, test script updating and > > finishing...). > > A few other open-bio projects have actively discussed a CVS->SVN > migration (BioRuby and I think BioPython, though the latter could be > wrong). As I said, "it might be worth looking into" to weigh the > pros/cons, get others opinions from others who have made the > transition, etc. We could, as Jason suggested, even set up a tester > SVN w/o making it the default codebase (lock it off to a few testers, > have CVS commits automatically/manually carry over to SVN, etc). > > I agree with you that it's not feasible to switch over prior to a > release and that there are more pressing issues, but it doesn't hurt > having an open discussion about it. > > chris > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From spiros at lokku.com Fri Jun 15 21:40:32 2007 From: spiros at lokku.com (Spiros Denaxas) Date: Fri, 15 Jun 2007 22:40:32 +0100 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> Message-ID: On 6/15/07, rvos wrote: > Hi, > > I would very much prefer it if bioperl moved to svn. I'm considering merging Bio::Phylo (to the extent that that's possible/practical) with bioperl and move it to an OBF repository, but I'd rather not go back to CVS. > > Rutger > I second that, SVN seems like the reasonable choice. I would be more than happy to help out as well. 
Spiros > > -----Original Message----- > > > Date: Fri Jun 15 07:56:23 PDT 2007 > > From: "Chris Fields" > > Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy > > To: "Sendu Bala" > > > > > > On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote: > > > > >>>> ... > > >>> Can we do any sort of massive conversion at some logical timepoint. > > >>> Probably after a branch release or something? Because it basically > > >>> means we're going to have differences on nearly every line which is > > >>> going to make diff-ing difficult when debugging old/new versions. > > >>> Maybe it is not a problem because we aren't introducing and new > > >>> bugs! > > > > > > Sorry, can you clarify the problem you envisage? And why would > > > making a branch release help? > > > > Maybe the worry is that mass conversion in such a large codebase > > could lead to hard-to-locate bugs. Shouldn't occur but who knows w/o > > trying? > > > > >> I agree; if we intend on doing this it should be all at once, > > >> maybe on a branch dedicated to ensure that code changes don't > > >> tank tests (they shouldn't but one never knows). We would then > > >> need a script up- and-running that tidies everything up prior to > > >> commits (though what happens if perltidy tanks?...). > > >> Sendu, up for it? > > > > > > If its going to be difficult and a hassle, for such an unnecessary > > > thing I'm not sure its worth it. There are more pressing things to > > > be done for Bioperl. > > > > > > If I can just run perltidy on the entire package and commit, I'd do > > > it. If that's not appropriate, I won't. > > > > The choices aren't necessarily all or nothing. What about voluntary, > > recommended use of a perltidy config file included with the > > distribution, with additional 'caveats'? See my response to Sean. > > > > >>>> About svn > > > [snip] > > >> Stepped into that one, didn't I! 
I'll look into how much effort > > >> is involved and try getting something going in the next month or > > >> two, maybe sooner if time permits. I'm lacking on SVN-foo as > > >> well but it might be worth looking into. > > > > > > I'd put this in the unnecessary-but-nice category as well. If it > > > will be as easy as my ->new change, go ahead. If not, there are > > > more pressing matters (POD fixing, test script updating and > > > finishing...). > > > > A few other open-bio projects have actively discussed a CVS->SVN > > migration (BioRuby and I think BioPython, though the latter could be > > wrong). As I said, "it might be worth looking into" to weigh the > > pros/cons, get others opinions from others who have made the > > transition, etc. We could, as Jason suggested, even set up a tester > > SVN w/o making it the default codebase (lock it off to a few testers, > > have CVS commits automatically/manually carry over to SVN, etc). > > > > I agree with you that it's not feasible to switch over prior to a > > release and that there are more pressing issues, but it doesn't hurt > > having an open discussion about it. 
> >
> > chris
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

From hlapp at gmx.net Fri Jun 15 22:10:25 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 15 Jun 2007 18:10:25 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To:
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
Message-ID: <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>

So should we set up a sandbox svn repository and those who would like
to help out

- take shots at migrating bioperl (any current cvs snapshot will do)
  to svn

- you document what you find yourself having to do in trying to make
  it work

- you report back when you think you have a working repository

- we all get a defined amount of time to test to our hearts' content,
  say 2 weeks

- you fix issues that were encountered

- report back when done, followed by retesting for, say 1 week

- iterate previous 2 steps until no issues and no objections to
  migration

- two more weeks of warning period to all developers to commit all
  outstanding changes, or reapply them to a future svn checkout

- pull the trigger by locking down cvs, applying the migration as
  worked out before, and announcing that BioPerl is now on svn

- get free beer at next BOSC (I'll pay if no one else does)

This may not be precisely the plan that needs to be executed, but
it's probably somewhere along those lines.

If there are volunteers who would like to spearhead this, then power
to you - I think everyone is in favor and the advantages of svn don't
need to be debated. The only reason it hasn't happened yet is because
no one has stepped forward who would have the energy.

I'm sure ChrisD will gladly create the svn sandbox if we have volunteers lined up to get going. -hilmar On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote: > On 6/15/07, rvos wrote: >> Hi, >> >> I would very much prefer it if bioperl moved to svn. I'm >> considering merging Bio::Phylo (to the extent that that's possible/ >> practical) with bioperl and move it to an OBF repository, but I'd >> rather not go back to CVS. >> >> Rutger >> > > I second that, SVN seems like the reasonable choice. I would be more > than happy to help out as well. > > Spiros > >> >> -----Original Message----- >> >>> Date: Fri Jun 15 07:56:23 PDT 2007 >>> From: "Chris Fields" >>> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy >>> To: "Sendu Bala" >>> >>> >>> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote: >>> >>>>>>> ... >>>>>> Can we do any sort of massive conversion at some logical >>>>>> timepoint. >>>>>> Probably after a branch release or something? Because it >>>>>> basically >>>>>> means we're going to have differences on nearly every line >>>>>> which is >>>>>> going to make diff-ing difficult when debugging old/new versions. >>>>>> Maybe it is not a problem because we aren't introducing and new >>>>>> bugs! >>>> >>>> Sorry, can you clarify the problem you envisage? And why would >>>> making a branch release help? >>> >>> Maybe the worry is that mass conversion in such a large codebase >>> could lead to hard-to-locate bugs. Shouldn't occur but who knows >>> w/o >>> trying? >>> >>>>> I agree; if we intend on doing this it should be all at once, >>>>> maybe on a branch dedicated to ensure that code changes don't >>>>> tank tests (they shouldn't but one never knows). We would then >>>>> need a script up- and-running that tidies everything up prior to >>>>> commits (though what happens if perltidy tanks?...). >>>>> Sendu, up for it? >>>> >>>> If its going to be difficult and a hassle, for such an unnecessary >>>> thing I'm not sure its worth it. 
There are more pressing things to >>>> be done for Bioperl. >>>> >>>> If I can just run perltidy on the entire package and commit, I'd do >>>> it. If that's not appropriate, I won't. >>> >>> The choices aren't necessarily all or nothing. What about >>> voluntary, >>> recommended use of a perltidy config file included with the >>> distribution, with additional 'caveats'? See my response to Sean. >>> >>>>>>> About svn >>>> [snip] >>>>> Stepped into that one, didn't I! I'll look into how much effort >>>>> is involved and try getting something going in the next month or >>>>> two, maybe sooner if time permits. I'm lacking on SVN-foo as >>>>> well but it might be worth looking into. >>>> >>>> I'd put this in the unnecessary-but-nice category as well. If it >>>> will be as easy as my ->new change, go ahead. If not, there are >>>> more pressing matters (POD fixing, test script updating and >>>> finishing...). >>> >>> A few other open-bio projects have actively discussed a CVS->SVN >>> migration (BioRuby and I think BioPython, though the latter could be >>> wrong). As I said, "it might be worth looking into" to weigh the >>> pros/cons, get others opinions from others who have made the >>> transition, etc. We could, as Jason suggested, even set up a tester >>> SVN w/o making it the default codebase (lock it off to a few >>> testers, >>> have CVS commits automatically/manually carry over to SVN, etc). >>> >>> I agree with you that it's not feasible to switch over prior to a >>> release and that there are more pressing issues, but it doesn't hurt >>> having an open discussion about it. 
>>>
>>> chris
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From jason at bioperl.org Fri Jun 15 22:23:15 2007
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 15 Jun 2007 15:23:15 -0700
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
Message-ID:

Sounds like a plan. I'll be curious to see if we can still keep
anonymous CVS working, as I'd like to not have to pull the plug on
that. There are some threads out on the web about how to do this with
a commit rule on SVN.

Also, can someone who is close enough to all the SVN benefits please
elaborate how it is going to help _this_ project? Perhaps you would
be willing to put a few words up -- like on (a to-be-created page):
http://bioperl.org/wiki/BioPerl:Version_control_changeover

This way, if anonymous CVS is broken and/or developers who haven't
been paying attention come back to commit code and ask why things
changed, we don't have to compose long emails...
=) -jason On Jun 15, 2007, at 3:10 PM, Hilmar Lapp wrote: > So should we set up a sandbox svn repository and those who would like > to help out > > - take shots at migrating bioperl (any current cvs snapshot will do) > to svn > > - you document what you find yourself having to do in trying to make > it work > > - you report back when you think you have a working repository > > - we all get a defined amount of time to test to our hearts' content, > say 2 weeks > > - you fix issues that were encountered > > - report back when done, followed by retesting for, say 1 week > > - iterate previous 2 steps until no issues and no objections to > migration > > - two more weeks of warning period to all developers to commit all > outstanding changes, or reapply them to a future svn checkout > > - pull the trigger by locking down cvs, applying the migration as > worked out before, and announcing that BioPerl is now on svn > > - get free beer at next BOSC (I'll pay if no one else does) > > This may not be precisely the plan that needs to be executed, but > it's probably somewhere along those lines. > > If there are volunteers who would like to spearhead this, then power > to you - I think everyone is in favor and the advantages of svn don't > need to be debated. The only reason it hasn't happened yet is because > no one has stepped forward who would have the energy. > > I'm sure ChrisD will gladly create the svn sandbox if we have > volunteers lined up to get going. > > -hilmar > > On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote: > >> On 6/15/07, rvos wrote: >>> Hi, >>> >>> I would very much prefer it if bioperl moved to svn. I'm >>> considering merging Bio::Phylo (to the extent that that's possible/ >>> practical) with bioperl and move it to an OBF repository, but I'd >>> rather not go back to CVS. >>> >>> Rutger >>> >> >> I second that, SVN seems like the reasonable choice. I would be more >> than happy to help out as well. 
>> >> Spiros >> >>> >>> -----Original Message----- >>> >>>> Date: Fri Jun 15 07:56:23 PDT 2007 >>>> From: "Chris Fields" >>>> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy >>>> To: "Sendu Bala" >>>> >>>> >>>> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote: >>>> >>>>>>>> ... >>>>>>> Can we do any sort of massive conversion at some logical >>>>>>> timepoint. >>>>>>> Probably after a branch release or something? Because it >>>>>>> basically >>>>>>> means we're going to have differences on nearly every line >>>>>>> which is >>>>>>> going to make diff-ing difficult when debugging old/new >>>>>>> versions. >>>>>>> Maybe it is not a problem because we aren't introducing and new >>>>>>> bugs! >>>>> >>>>> Sorry, can you clarify the problem you envisage? And why would >>>>> making a branch release help? >>>> >>>> Maybe the worry is that mass conversion in such a large codebase >>>> could lead to hard-to-locate bugs. Shouldn't occur but who knows >>>> w/o >>>> trying? >>>> >>>>>> I agree; if we intend on doing this it should be all at once, >>>>>> maybe on a branch dedicated to ensure that code changes don't >>>>>> tank tests (they shouldn't but one never knows). We would then >>>>>> need a script up- and-running that tidies everything up prior to >>>>>> commits (though what happens if perltidy tanks?...). >>>>>> Sendu, up for it? >>>>> >>>>> If its going to be difficult and a hassle, for such an unnecessary >>>>> thing I'm not sure its worth it. There are more pressing things to >>>>> be done for Bioperl. >>>>> >>>>> If I can just run perltidy on the entire package and commit, >>>>> I'd do >>>>> it. If that's not appropriate, I won't. >>>> >>>> The choices aren't necessarily all or nothing. What about >>>> voluntary, >>>> recommended use of a perltidy config file included with the >>>> distribution, with additional 'caveats'? See my response to Sean. >>>> >>>>>>>> About svn >>>>> [snip] >>>>>> Stepped into that one, didn't I! 
I'll look into how much effort >>>>>> is involved and try getting something going in the next month or >>>>>> two, maybe sooner if time permits. I'm lacking on SVN-foo as >>>>>> well but it might be worth looking into. >>>>> >>>>> I'd put this in the unnecessary-but-nice category as well. If it >>>>> will be as easy as my ->new change, go ahead. If not, there are >>>>> more pressing matters (POD fixing, test script updating and >>>>> finishing...). >>>> >>>> A few other open-bio projects have actively discussed a CVS->SVN >>>> migration (BioRuby and I think BioPython, though the latter >>>> could be >>>> wrong). As I said, "it might be worth looking into" to weigh the >>>> pros/cons, get others opinions from others who have made the >>>> transition, etc. We could, as Jason suggested, even set up a >>>> tester >>>> SVN w/o making it the default codebase (lock it off to a few >>>> testers, >>>> have CVS commits automatically/manually carry over to SVN, etc). >>>> >>>> I agree with you that it's not feasible to switch over prior to a >>>> release and that there are more pressing issues, but it doesn't >>>> hurt >>>> having an open discussion about it. 
>>>> >>>> chris >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From sheris at eps.berkeley.edu Fri Jun 15 22:58:12 2007 From: sheris at eps.berkeley.edu (Sheri Simmons) Date: Fri, 15 Jun 2007 15:58:12 -0700 Subject: [Bioperl-l] seq doesn't validate error Message-ID: <200706151558.12911.sheris@eps.berkeley.edu> Hi, I'm getting an error as follows when I try to reverse complement a sequence string stored in a hash of arrays. The storage code is: $nstarthash{$key} = [$sortchecks[0], join("", @nseq), join("",@{$seqhash{$key}})]; the sequence of interest is the element at index 1. 

Later, I try to retrieve this string for a subset of keys so I can
reverse complement it based on input from another hash (%complement):

my %revcomphash = map { my $read = $_;
    grep $complement{$read} eq 'C', %complement;
    {$_, (Bio::Seq->new(-seq =>$nstarthash{$_}[1]))->revcom->seq()};}
    keys(%nstarthash);

I get the following warning (long sequence edited for clarity):

-- -------------------- WARNING ---------------------
MSG: seq doesn't validate, mismatch is 1
---------------------------------------------------

------------- EXCEPTION -------------
MSG: Attempting to set the sequence to [GCCCCTGTAATCGCTTTTATATCGTCAGCGATC]
which does not look healthy
STACK Bio::PrimarySeq::seq /usr/share/perl5/Bio/PrimarySeq.pm:268
STACK Bio::PrimarySeq::new /usr/share/perl5/Bio/PrimarySeq.pm:217
STACK Bio::Seq::new /usr/share/perl5/Bio/Seq.pm:498
STACK toplevel ../quality_wrapper.pl:103

I cannot find any non-allowed characters in the sequence, and the
de-referencing appears to work correctly. Can anyone help me?
I'm using the latest Bioperl installation (1.5.2) with ActivePerl5.8
on a Mepis 6.5 system.

Thanks
Sheri

---------------------------------------------------------------------
Sheri Simmons
Department of Earth and Planetary Sciences
University of California, Berkeley
Berkeley, CA 94720-4767

From Kevin.M.Brown at asu.edu Fri Jun 15 23:11:34 2007
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Fri, 15 Jun 2007 16:11:34 -0700
Subject: [Bioperl-l] seq doesn't validate error
In-Reply-To: <200706151558.12911.sheris@eps.berkeley.edu>
References: <200706151558.12911.sheris@eps.berkeley.edu>
Message-ID: <1A4207F8295607498283FE9E93B775B4034C6E7B@EX02.asurite.ad.asu.edu>

> I'm getting an error as follows when I try to reverse
> complement a sequence string stored in a hash of arrays. The
> storage code is:
>
> $nstarthash{$key} = [$sortchecks[0], join("",
> @nseq),
> join("",@{$seqhash{$key}})];
>
> the sequence of interest is the element at index 1.
> > Later, I try to retrieve this string for a subset of keys so > I can reverse complement it based on input from another hash > (%complement): > > my %revcomphash = map { my $read = $_; > grep $complement{$read} eq 'C', %complement; > {$_, (Bio::Seq->new(-seq > =>$nstarthash{$_}[1]))->revcom->seq()};} > keys(%nstarthash); > > > I get the following warning (long sequence edited for clarity): > > -- -------------------- WARNING --------------------- > MSG: seq doesn't validate, mismatch is 1 > --------------------------------------------------- > > ------------- EXCEPTION ------------- > MSG: Attempting to set the sequence to > [GCCCCTGTAATCGCTTTTATATCGTCAGCGATC] > which does not look healthy > STACK Bio::PrimarySeq::seq /usr/share/perl5/Bio/PrimarySeq.pm:268 > STACK Bio::PrimarySeq::new /usr/share/perl5/Bio/PrimarySeq.pm:217 > STACK Bio::Seq::new /usr/share/perl5/Bio/Seq.pm:498 STACK > toplevel ../quality_wrapper.pl:103 > > I cannot find any non-allowed characters in the sequence, and > the de-referencing appears to work correctly. Can anyone help me? > I'm using the latest Bioperl installation (1.5.2) with > ActivePerl5.8 on a Mepis 6.5 system. Try telling the Bio::Seq object what alphabet to use when creating it. I tend to create them like: Bio::Seq->new(-seq=> $seqvar, -alphabet=>'dna') From sheris at eps.berkeley.edu Fri Jun 15 23:53:04 2007 From: sheris at eps.berkeley.edu (Sheri Simmons) Date: Fri, 15 Jun 2007 16:53:04 -0700 Subject: [Bioperl-l] seq doesn't validate error In-Reply-To: <1A4207F8295607498283FE9E93B775B4034C6E7B@EX02.asurite.ad.asu.edu> References: <200706151558.12911.sheris@eps.berkeley.edu> <1A4207F8295607498283FE9E93B775B4034C6E7B@EX02.asurite.ad.asu.edu> Message-ID: <200706151653.04135.sheris@eps.berkeley.edu> Thanks for the suggestion, but that still gives the same error as before. 
On Friday 15 June 2007 4:11 pm, Kevin Brown wrote: > > I'm getting an error as follows when I try to reverse > > complement a sequence string stored in a hash of arrays. The > > storage code is: > > > > $nstarthash{$key} = [$sortchecks[0], join("", > > @nseq), > > join("",@{$seqhash{$key}})]; > > > > the sequence of interest is the element at index 1. > > > > Later, I try to retrieve this string for a subset of keys so > > I can reverse complement it based on input from another hash > > (%complement): > > > > my %revcomphash = map { my $read = $_; > > grep $complement{$read} eq 'C', %complement; > > {$_, (Bio::Seq->new(-seq > > =>$nstarthash{$_}[1]))->revcom->seq()};} > > keys(%nstarthash); > > > > > > I get the following warning (long sequence edited for clarity): > > > > -- -------------------- WARNING --------------------- > > MSG: seq doesn't validate, mismatch is 1 > > --------------------------------------------------- > > > > ------------- EXCEPTION ------------- > > MSG: Attempting to set the sequence to > > [GCCCCTGTAATCGCTTTTATATCGTCAGCGATC] > > which does not look healthy > > STACK Bio::PrimarySeq::seq /usr/share/perl5/Bio/PrimarySeq.pm:268 > > STACK Bio::PrimarySeq::new /usr/share/perl5/Bio/PrimarySeq.pm:217 > > STACK Bio::Seq::new /usr/share/perl5/Bio/Seq.pm:498 STACK > > toplevel ../quality_wrapper.pl:103 > > > > I cannot find any non-allowed characters in the sequence, and > > the de-referencing appears to work correctly. Can anyone help me? > > I'm using the latest Bioperl installation (1.5.2) with > > ActivePerl5.8 on a Mepis 6.5 system. > > Try telling the Bio::Seq object what alphabet to use when creating it. 
> I tend to create them like: > > Bio::Seq->new(-seq=> $seqvar, -alphabet=>'dna') -- Sheri Simmons Department of Earth and Planetary Sciences University of California, Berkeley Berkeley, CA 94720-4767 From hlapp at gmx.net Sat Jun 16 01:27:42 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 15 Jun 2007 21:27:42 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <18035.14352.963113.473274@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <18035.14352.963113.473274@almost.alerce.com> Message-ID: Could you post a ticket to the helpdesk: support at open-bio.org. -hilmar On Jun 15, 2007, at 9:08 PM, George Hartzell wrote: > Hilmar Lapp writes: >> So should we set up a sandbox svn repository and those who would like >> to help out >> >> - take shots at migrating bioperl (any current cvs snapshot will do) >> to svn > > Free Beer, huh? Do you deliver? > > Can you package up a tarball of the cvs repository (bzip or gzip would > save some time) itself? > > thanks! > > g. -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hartzell at alerce.com Sat Jun 16 01:08:32 2007 From: hartzell at alerce.com (George Hartzell) Date: Fri, 15 Jun 2007 21:08:32 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> Message-ID: <18035.14352.963113.473274@almost.alerce.com> Hilmar Lapp writes: > So should we set up a sandbox svn repository and those who would like > to help out > > - take shots at migrating bioperl (any current cvs snapshot will do) > to svn Free Beer, huh? Do you deliver? Can you package up a tarball of the cvs repository (bzip or gzip would save some time) itself? thanks! g. 
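
For anyone who wants to try the same thing locally, here is a minimal sketch of packaging a CVS repository together with its history, as opposed to a plain checkout. It assumes a repository directory that is readable on disk. The paths, the temporary stand-in repository and the archive name are all hypothetical, and only standard `tar` and `grep` are used.

```shell
#!/bin/sh
# Sketch only: the paths below are hypothetical stand-ins,
# not the real layout of the BioPerl CVS server.
set -eu

workdir=$(mktemp -d)
cvsroot="$workdir/cvsroot"        # stand-in for the server-side repository root
module="bioperl-live"             # module name as used in the thread
archive="$workdir/bioperl-cvsroot.tar.gz"

# Fake a tiny repository so the sketch runs anywhere: CVS keeps each
# file's full history in an RCS file whose name ends in ",v".
mkdir -p "$cvsroot/CVSROOT" "$cvsroot/$module/Bio"
: > "$cvsroot/$module/Bio/Seq.pm,v"

# Archive the repository itself (history included), not a checkout.
tar -czf "$archive" -C "$cvsroot" CVSROOT "$module"

# List the history files to confirm the archive carries them.
tar -tzf "$archive" | grep ',v$'
```

A tarball of a checked-out working copy contains no `,v` files, so it would not be enough for testing a migration that is meant to preserve the change history.
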

From cjfields at uiuc.edu Sat Jun 16 01:42:05 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 20:42:05 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <18035.14352.963113.473274@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <18035.14352.963113.473274@almost.alerce.com>
Message-ID: <6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu>

The browsable CVS has a 'Download tarball' link if that helps.

http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/?cvsroot=bioperl

chris

On Jun 15, 2007, at 8:08 PM, George Hartzell wrote:

> Hilmar Lapp writes:
>> So should we set up a sandbox svn repository and those who would like
>> to help out
>>
>> - take shots at migrating bioperl (any current cvs snapshot will do)
>> to svn
>
> Free Beer, huh? Do you deliver?
>
> Can you package up a tarball of the cvs repository (bzip or gzip would
> save some time) itself?
>
> thanks!
>
> g.

From cjfields at uiuc.edu Sat Jun 16 01:50:09 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 20:50:09 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
Message-ID: <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>

I'll help out to the extent I can w/o having the SVN know-how. We
need (as Jason points out) someone who can detail the benefits and
maybe keep an updated journal on the wiki. I believe at least one or
two of the other Bio* projects have contemplated moving over to SVN,
which may be worth checking out.

chris On Jun 15, 2007, at 5:10 PM, Hilmar Lapp wrote: > So should we set up a sandbox svn repository and those who would like > to help out > > - take shots at migrating bioperl (any current cvs snapshot will do) > to svn > > - you document what you find yourself having to do in trying to make > it work > > - you report back when you think you have a working repository > > - we all get a defined amount of time to test to our hearts' content, > say 2 weeks > > - you fix issues that were encountered > > - report back when done, followed by retesting for, say 1 week > > - iterate previous 2 steps until no issues and no objections to > migration > > - two more weeks of warning period to all developers to commit all > outstanding changes, or reapply them to a future svn checkout > > - pull the trigger by locking down cvs, applying the migration as > worked out before, and announcing that BioPerl is now on svn > > - get free beer at next BOSC (I'll pay if no one else does) > > This may not be precisely the plan that needs to be executed, but > it's probably somewhere along those lines. > > If there are volunteers who would like to spearhead this, then power > to you - I think everyone is in favor and the advantages of svn don't > need to be debated. The only reason it hasn't happened yet is because > no one has stepped forward who would have the energy. > > I'm sure ChrisD will gladly create the svn sandbox if we have > volunteers lined up to get going. > > -hilmar > > On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote: > >> On 6/15/07, rvos wrote: >>> Hi, >>> >>> I would very much prefer it if bioperl moved to svn. I'm >>> considering merging Bio::Phylo (to the extent that that's possible/ >>> practical) with bioperl and move it to an OBF repository, but I'd >>> rather not go back to CVS. >>> >>> Rutger >>> >> >> I second that, SVN seems like the reasonable choice. I would be more >> than happy to help out as well. 
>> >> Spiros >> >>> >>> -----Original Message----- >>> >>>> Date: Fri Jun 15 07:56:23 PDT 2007 >>>> From: "Chris Fields" >>>> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy >>>> To: "Sendu Bala" >>>> >>>> >>>> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote: >>>> >>>>>>>> ... >>>>>>> Can we do any sort of massive conversion at some logical >>>>>>> timepoint. >>>>>>> Probably after a branch release or something? Because it >>>>>>> basically >>>>>>> means we're going to have differences on nearly every line >>>>>>> which is >>>>>>> going to make diff-ing difficult when debugging old/new >>>>>>> versions. >>>>>>> Maybe it is not a problem because we aren't introducing and new >>>>>>> bugs! >>>>> >>>>> Sorry, can you clarify the problem you envisage? And why would >>>>> making a branch release help? >>>> >>>> Maybe the worry is that mass conversion in such a large codebase >>>> could lead to hard-to-locate bugs. Shouldn't occur but who knows >>>> w/o >>>> trying? >>>> >>>>>> I agree; if we intend on doing this it should be all at once, >>>>>> maybe on a branch dedicated to ensure that code changes don't >>>>>> tank tests (they shouldn't but one never knows). We would then >>>>>> need a script up- and-running that tidies everything up prior to >>>>>> commits (though what happens if perltidy tanks?...). >>>>>> Sendu, up for it? >>>>> >>>>> If its going to be difficult and a hassle, for such an unnecessary >>>>> thing I'm not sure its worth it. There are more pressing things to >>>>> be done for Bioperl. >>>>> >>>>> If I can just run perltidy on the entire package and commit, >>>>> I'd do >>>>> it. If that's not appropriate, I won't. >>>> >>>> The choices aren't necessarily all or nothing. What about >>>> voluntary, >>>> recommended use of a perltidy config file included with the >>>> distribution, with additional 'caveats'? See my response to Sean. >>>> >>>>>>>> About svn >>>>> [snip] >>>>>> Stepped into that one, didn't I! 
I'll look into how much effort >>>>>> is involved and try getting something going in the next month or >>>>>> two, maybe sooner if time permits. I'm lacking on SVN-foo as >>>>>> well but it might be worth looking into. >>>>> >>>>> I'd put this in the unnecessary-but-nice category as well. If it >>>>> will be as easy as my ->new change, go ahead. If not, there are >>>>> more pressing matters (POD fixing, test script updating and >>>>> finishing...). >>>> >>>> A few other open-bio projects have actively discussed a CVS->SVN >>>> migration (BioRuby and I think BioPython, though the latter >>>> could be >>>> wrong). As I said, "it might be worth looking into" to weigh the >>>> pros/cons, get others opinions from others who have made the >>>> transition, etc. We could, as Jason suggested, even set up a >>>> tester >>>> SVN w/o making it the default codebase (lock it off to a few >>>> testers, >>>> have CVS commits automatically/manually carry over to SVN, etc). >>>> >>>> I agree with you that it's not feasible to switch over prior to a >>>> release and that there are more pressing issues, but it doesn't >>>> hurt >>>> having an open discussion about it. 
>>>> >>>> chris >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Sat Jun 16 02:12:55 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 15 Jun 2007 22:12:55 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <18035.14352.963113.473274@almost.alerce.com> <6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu> Message-ID: <6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net> I think he meant the cvs repository itself, containing all the change data. -hilmar On Jun 15, 2007, at 9:42 PM, Chris Fields wrote: > The browsable CVS has a 'Download tarball' link if that helps. > > http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/? 
> cvsroot=bioperl > > chris > > On Jun 15, 2007, at 8:08 PM, George Hartzell wrote: > >> Hilmar Lapp writes: >>> So should we set up a sandbox svn repository and those who would >>> like >>> to help out >>> >>> - take shots at migrating bioperl (any current cvs snapshot will do) >>> to svn >> >> Free Beer, huh? Do you deliver? >> >> Can you package up a tarball of the cvs repository (bzip or gzip >> would >> save some time) itself? >> >> thanks! >> >> g. > > > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Sat Jun 16 02:37:55 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 15 Jun 2007 21:37:55 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <18035.14352.963113.473274@almost.alerce.com> <6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu> <6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net> Message-ID: Ah, got it. Sorry. George, planning on taking this up? chris On Jun 15, 2007, at 9:12 PM, Hilmar Lapp wrote: > I think he meant the cvs repository itself, containing all the > change data. -hilmar > > On Jun 15, 2007, at 9:42 PM, Chris Fields wrote: > >> The browsable CVS has a 'Download tarball' link if that helps. >> >> http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/? >> cvsroot=bioperl >> >> chris >> >> On Jun 15, 2007, at 8:08 PM, George Hartzell wrote: >> >>> Hilmar Lapp writes: >>>> So should we set up a sandbox svn repository and those who would >>>> like >>>> to help out >>>> >>>> - take shots at migrating bioperl (any current cvs snapshot will >>>> do) >>>> to svn >>> >>> Free Beer, huh? Do you deliver? >>> >>> Can you package up a tarball of the cvs repository (bzip or gzip >>> would >>> save some time) itself? 
>>> >>> thanks! >>> >>> g. >> >> >> > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Sat Jun 16 08:20:57 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Sat, 16 Jun 2007 09:20:57 +0100 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <18035.14352.963113.473274@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <18035.14352.963113.473274@almost.alerce.com> Message-ID: <46739D69.4090204@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 George Hartzell wrote: > Hilmar Lapp writes: > > So should we set up a sandbox svn repository and those who would like > > to help out > > > > - take shots at migrating bioperl (any current cvs snapshot will do) > > to svn > > Free Beer, huh? Do you deliver? > > Can you package up a tarball of the cvs repository (bzip or gzip would > save some time) itself? > > thanks! > > g. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Sounds like George might know what he's doing! I have a question about setting up svn access. I believe access can be done in several ways, over webdav, over ssh and probably others too. Do you have any knowledge about the benefits of one over the other? I suppose I'm thinking of what to implement to allow anonymous read access for users and authenticated access for developers. Nath p.s. if you need any monkeys to do some work I'm happy to help out as much as possible. 
-----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGc51pczuW2jkwy2gRAmi9AJ0XojVdh4ckXoc3bwVSmeNw95cR7QCfV+G9 Lb9NUEe4dkCakQ+Gc7Py98A= =BG9m -----END PGP SIGNATURE----- From rvos at interchange.ubc.ca Sat Jun 16 10:37:11 2007 From: rvos at interchange.ubc.ca (rvos) Date: Sat, 16 Jun 2007 03:37:11 -0700 (PDT) Subject: [Bioperl-l] SVN and ...Re: Perltidy Message-ID: <15232024.1181990231860.JavaMail.myubc2@handel.my.ubc.ca> I can volunteer some time to help out with this. Rutger -----Original Message----- > Date: Fri Jun 15 15:10:25 PDT 2007 > From: "Hilmar Lapp" > Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy > To: spiros at lokku.com > > So should we set up a sandbox svn repository and those who would like > to help out > > - take shots at migrating bioperl (any current cvs snapshot will do) > to svn > > - you document what you find yourself having to do in trying to make > it work > > - you report back when you think you have a working repository > > - we all get a defined amount of time to test to our hearts' content, > say 2 weeks > > - you fix issues that were encountered > > - report back when done, followed by retesting for, say 1 week > > - iterate previous 2 steps until no issues and no objections to > migration > > - two more weeks of warning period to all developers to commit all > outstanding changes, or reapply them to a future svn checkout > > - pull the trigger by locking down cvs, applying the migration as > worked out before, and announcing that BioPerl is now on svn > > - get free beer at next BOSC (I'll pay if no one else does) > > This may not be precisely the plan that needs to be executed, but > it's probably somewhere along those lines. > > If there are volunteers who would like to spearhead this, then power > to you - I think everyone is in favor and the advantages of svn don't > need to be debated. 
The only reason it hasn't happened yet is because > no one has stepped forward who would have the energy. > > I'm sure ChrisD will gladly create the svn sandbox if we have > volunteers lined up to get going. > > -hilmar > > On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote: > > > On 6/15/07, rvos wrote: > >> Hi, > >> > >> I would very much prefer it if bioperl moved to svn. I'm > >> considering merging Bio::Phylo (to the extent that that's possible/ > >> practical) with bioperl and move it to an OBF repository, but I'd > >> rather not go back to CVS. > >> > >> Rutger > >> > > > > I second that, SVN seems like the reasonable choice. I would be more > > than happy to help out as well. > > > > Spiros > > > >> > >> -----Original Message----- > >> > >>> Date: Fri Jun 15 07:56:23 PDT 2007 > >>> From: "Chris Fields" > >>> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy > >>> To: "Sendu Bala" > >>> > >>> > >>> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote: > >>> > >>>>>>> ... > >>>>>> Can we do any sort of massive conversion at some logical > >>>>>> timepoint. > >>>>>> Probably after a branch release or something? Because it > >>>>>> basically > >>>>>> means we're going to have differences on nearly every line > >>>>>> which is > >>>>>> going to make diff-ing difficult when debugging old/new versions. > >>>>>> Maybe it is not a problem because we aren't introducing and new > >>>>>> bugs! > >>>> > >>>> Sorry, can you clarify the problem you envisage? And why would > >>>> making a branch release help? > >>> > >>> Maybe the worry is that mass conversion in such a large codebase > >>> could lead to hard-to-locate bugs. Shouldn't occur but who knows > >>> w/o > >>> trying? > >>> > >>>>> I agree; if we intend on doing this it should be all at once, > >>>>> maybe on a branch dedicated to ensure that code changes don't > >>>>> tank tests (they shouldn't but one never knows). 
We would then > >>>>> need a script up- and-running that tidies everything up prior to > >>>>> commits (though what happens if perltidy tanks?...). > >>>>> Sendu, up for it? > >>>> > >>>> If its going to be difficult and a hassle, for such an unnecessary > >>>> thing I'm not sure its worth it. There are more pressing things to > >>>> be done for Bioperl. > >>>> > >>>> If I can just run perltidy on the entire package and commit, I'd do > >>>> it. If that's not appropriate, I won't. > >>> > >>> The choices aren't necessarily all or nothing. What about > >>> voluntary, > >>> recommended use of a perltidy config file included with the > >>> distribution, with additional 'caveats'? See my response to Sean. > >>> > >>>>>>> About svn > >>>> [snip] > >>>>> Stepped into that one, didn't I! I'll look into how much effort > >>>>> is involved and try getting something going in the next month or > >>>>> two, maybe sooner if time permits. I'm lacking on SVN-foo as > >>>>> well but it might be worth looking into. > >>>> > >>>> I'd put this in the unnecessary-but-nice category as well. If it > >>>> will be as easy as my ->new change, go ahead. If not, there are > >>>> more pressing matters (POD fixing, test script updating and > >>>> finishing...). > >>> > >>> A few other open-bio projects have actively discussed a CVS->SVN > >>> migration (BioRuby and I think BioPython, though the latter could be > >>> wrong). As I said, "it might be worth looking into" to weigh the > >>> pros/cons, get others opinions from others who have made the > >>> transition, etc. We could, as Jason suggested, even set up a tester > >>> SVN w/o making it the default codebase (lock it off to a few > >>> testers, > >>> have CVS commits automatically/manually carry over to SVN, etc). > >>> > >>> I agree with you that it's not feasible to switch over prior to a > >>> release and that there are more pressing issues, but it doesn't hurt > >>> having an open discussion about it. 
> >>> > >>> chris > >>> > >>> _______________________________________________ > >>> Bioperl-l mailing list > >>> Bioperl-l at lists.open-bio.org > >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >> > >> _______________________________________________ > >> Bioperl-l mailing list > >> Bioperl-l at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >> > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From sdavis2 at mail.nih.gov Sat Jun 16 11:21:47 2007 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Sat, 16 Jun 2007 07:21:47 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> Message-ID: <4673C7CB.1030709@mail.nih.gov> Chris Fields wrote: > I'll help out to the extent I can w/o having the SVN know-how. We > need (as Jason points out) someone who can detail the benefits and > maybe keep an updated journal on the wiki. > > I believe at least one or two of the other Bio* contemplated moving > over to SVN, which may be worth checking out. > The bioconductor project is on SVN. The project includes over 200 packages (the equivalent of perl modules) with something around 150-200 ACTIVE developers. They also have a build system for several OSes that operates on a cron-like system with builds of several versions approximately daily. 
Their system is running at something like revision 30,000, so they have significant experience. If anyone would like technical support, I can certainly ask the folks maintaining their site if they can give some input. Let me know if anyone would like a contact person. As for access, the typical access is over http (or https). Access controls can be set up on the server side while allowing anonymous access for checkout. There are many excellent SVN clients for every OS, so that should not be a problem. Sean From cjfields at uiuc.edu Sat Jun 16 14:02:35 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 16 Jun 2007 09:02:35 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <4673C7CB.1030709@mail.nih.gov> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> Message-ID: <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> On Jun 16, 2007, at 6:21 AM, Sean Davis wrote: > Chris Fields wrote: >> I'll help out to the extent I can w/o having the SVN know-how. We >> need (as Jason points out) someone who can detail the benefits and >> maybe keep an updated journal on the wiki. >> >> I believe at least one or two of the other Bio* contemplated moving >> over to SVN, which may be worth checking out. >> > The bioconductor project is on SVN. The project includes over 200 > packages (the equivalent of perl modules) with something around > 150-200 > ACTIVE developers. They also have a build system for several OSes > that > operates on a cron-like system with builds of several versions > approximately daily. Their system is running at something like > revision > 30,000, so they have significant experience. If anyone would like > technical support, I can certainly ask the folks maintaining their > site > if they can give some input. Let me know if anyone would like a > contact > person.
> > As for access, the typical access is over http (or https). Access > controls can be set up on the server side while allowing anonymous > access for checkout. There are many excellent SVN for every OS, so > that > should not be a problem. > > Sean It looks like George Hartzell may be taking a crack at it, with Rutger Vos, Nathan Haigh, and moi helping out where needed. If so we could have something testable relatively soon. After that we'll need to work out a few other issues, basically what's on Hilmar's list. chris From hlapp at gmx.net Sat Jun 16 14:40:08 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Sat, 16 Jun 2007 10:40:08 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> Message-ID: <51E89347-4AF7-482E-98DB-BE1AA0138A91@gmx.net> Just as an aside, even if we can't keep anonymous cvs working, I would think that using apache URL rewriting and a small CGI script that returns an appropriate page redirect we can without too much trouble keep the hyperlinks functional that people may have bookmarked -hilmar On Jun 15, 2007, at 6:23 PM, Jason Stajich wrote: > Sounds like a plan, I'll be curious to see if we can still get keep > anonymous CVS working as I'd like to not have to pull the plug on > that. There are some threads out on the web about how to do this > with a commit rule on SVN. > > Also, can someone who is close enough to all the SVN benefits > please elaborate how it is going to help _this_ project? > Perhaps you would be willing to put a few words up -- like on (a to > be created): > http://bioperl.org/wiki/BioPerl:Version_control_changeover > > This way if anonymous CVS is broken and/or developers who haven't > been paying attention come back to commit code ask why things > changed we don't have to compose long emails... 
=) > > -jason > On Jun 15, 2007, at 3:10 PM, Hilmar Lapp wrote: > >> So should we set up a sandbox svn repository and those who would like >> to help out >> >> - take shots at migrating bioperl (any current cvs snapshot will do) >> to svn >> >> - you document what you find yourself having to do in trying to make >> it work >> >> - you report back when you think you have a working repository >> >> - we all get a defined amount of time to test to our hearts' content, >> say 2 weeks >> >> - you fix issues that were encountered >> >> - report back when done, followed by retesting for, say 1 week >> >> - iterate previous 2 steps until no issues and no objections to >> migration >> >> - two more weeks of warning period to all developers to commit all >> outstanding changes, or reapply them to a future svn checkout >> >> - pull the trigger by locking down cvs, applying the migration as >> worked out before, and announcing that BioPerl is now on svn >> >> - get free beer at next BOSC (I'll pay if no one else does) >> >> This may not be precisely the plan that needs to be executed, but >> it's probably somewhere along those lines. >> >> If there are volunteers who would like to spearhead this, then power >> to you - I think everyone is in favor and the advantages of svn don't >> need to be debated. The only reason it hasn't happened yet is because >> no one has stepped forward who would have the energy. > >> >> I'm sure ChrisD will gladly create the svn sandbox if we have >> volunteers lined up to get going. >> >> -hilmar >> >> On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote: >> >>> On 6/15/07, rvos wrote: >>>> Hi, >>>> >>>> I would very much prefer it if bioperl moved to svn. I'm >>>> considering merging Bio::Phylo (to the extent that that's possible/ >>>> practical) with bioperl and move it to an OBF repository, but I'd >>>> rather not go back to CVS. >>>> >>>> Rutger >>>> >>> >>> I second that, SVN seems like the reasonable choice. 
I would be more >>> than happy to help out as well. >>> >>> Spiros >>> >>>> >>>> -----Original Message----- >>>> >>>>> Date: Fri Jun 15 07:56:23 PDT 2007 >>>>> From: "Chris Fields" >>>>> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy >>>>> To: "Sendu Bala" >>>>> >>>>> >>>>> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote: >>>>> >>>>>>>>> ... >>>>>>>> Can we do any sort of massive conversion at some logical >>>>>>>> timepoint. >>>>>>>> Probably after a branch release or something? Because it >>>>>>>> basically >>>>>>>> means we're going to have differences on nearly every line >>>>>>>> which is >>>>>>>> going to make diff-ing difficult when debugging old/new >>>>>>>> versions. >>>>>>>> Maybe it is not a problem because we aren't introducing and new >>>>>>>> bugs! >>>>>> >>>>>> Sorry, can you clarify the problem you envisage? And why would >>>>>> making a branch release help? >>>>> >>>>> Maybe the worry is that mass conversion in such a large codebase >>>>> could lead to hard-to-locate bugs. Shouldn't occur but who knows >>>>> w/o >>>>> trying? >>>>> >>>>>>> I agree; if we intend on doing this it should be all at once, >>>>>>> maybe on a branch dedicated to ensure that code changes don't >>>>>>> tank tests (they shouldn't but one never knows). We would then >>>>>>> need a script up- and-running that tidies everything up prior to >>>>>>> commits (though what happens if perltidy tanks?...). >>>>>>> Sendu, up for it? >>>>>> >>>>>> If its going to be difficult and a hassle, for such an >>>>>> unnecessary >>>>>> thing I'm not sure its worth it. There are more pressing >>>>>> things to >>>>>> be done for Bioperl. >>>>>> >>>>>> If I can just run perltidy on the entire package and commit, >>>>>> I'd do >>>>>> it. If that's not appropriate, I won't. >>>>> >>>>> The choices aren't necessarily all or nothing. What about >>>>> voluntary, >>>>> recommended use of a perltidy config file included with the >>>>> distribution, with additional 'caveats'? See my response to Sean. 
>>>>> >>>>>>>>> About svn >>>>>> [snip] >>>>>>> Stepped into that one, didn't I! I'll look into how much effort >>>>>>> is involved and try getting something going in the next >>>>>>> month or >>>>>>> two, maybe sooner if time permits. I'm lacking on SVN-foo as >>>>>>> well but it might be worth looking into. >>>>>> >>>>>> I'd put this in the unnecessary-but-nice category as well. If it >>>>>> will be as easy as my ->new change, go ahead. If not, there are >>>>>> more pressing matters (POD fixing, test script updating and >>>>>> finishing...). >>>>> >>>>> A few other open-bio projects have actively discussed a CVS->SVN >>>>> migration (BioRuby and I think BioPython, though the latter >>>>> could be >>>>> wrong). As I said, "it might be worth looking into" to weigh the >>>>> pros/cons, get others opinions >from others who have made the >>>>> transition, etc. We could, as Jason suggested, even set up a >>>>> tester >>>>> SVN w/o making it the default codebase (lock it off to a few >>>>> testers, >>>>> have CVS commits automatically/manually carry over to SVN, etc). >>>>> >>>>> I agree with you that it's not feasible to switch over prior to a >>>>> release and that there are more pressing issues, but it doesn't >>>>> hurt >>>>> having an open discussion about it. 
>>>>> >>>>> chris >>>>> >>>>> _______________________________________________ >>>>> Bioperl-l mailing list >>>>> Bioperl-l at lists.open-bio.org >>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> -- >> =========================================================== >> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >> =========================================================== >> >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > Jason Stajich > jason at bioperl.org > http://jason.open-bio.org/ > > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Sat Jun 16 14:55:09 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Sat, 16 Jun 2007 10:55:09 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <4673C7CB.1030709@mail.nih.gov> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> Message-ID: On Jun 16, 2007, at 7:21 AM, Sean Davis wrote: > As for access, the typical access is over http (or https). We're using svn+ssh here (NESCent) so the password is the same as the one you set for your account on the server, and you can use public/ private key negotiation for authentication. I think the ability to not provide a password for every single interaction is a requirement. 
If that requires using svn+ssh or can be made to work through https too I don't know. On sf.net I have to use https for svn and it doesn't ask me for the password each time. Not sure how this works though, maybe some local caching? We should not be using http, or whatever other protocol that sends unencrypted passwords. > Access controls can be set up on the server side while allowing > anonymous access for checkout. There are many excellent SVN for > every OS, so that should not be a problem. On Mac OSX the most convenient way I have found is through fink. It does ask to install 30 other dependencies, which had me balk at first, but me doing it by hand is even worse than fink doing it, so I finally gave in and it's really a breeze. I've not had a single issue. From a sysadmin perspective, what might be worth keeping in mind is that svn is going to store everything in a database (BerkeleyDB I think). I.e., there is no such thing anymore as restoring individual source code files from backup if one gets accidentally corrupted on the server. It seems you have to restore the entire database, i.e., the entire repository. I vaguely recall though that how svn manages the repository is actually configurable and that other storage than DB is possible too. Don't ask me for the pros and cons of one vs the other. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From rvos at interchange.ubc.ca Sat Jun 16 17:09:18 2007 From: rvos at interchange.ubc.ca (rvos) Date: Sat, 16 Jun 2007 10:09:18 -0700 (PDT) Subject: [Bioperl-l] SVN and ...Re: Perltidy Message-ID: <10517356.1182013758160.JavaMail.myubc2@brahms.my.ubc.ca> CIPRES and Bio::Phylo use svn. 
As for the benefits, a lot of sales talk has been expended over it already, for my own purpose I like the integration with eclipse (through subclipse plugin) and komodo, in addition to the atomic commits (so I can ctrl+c if I goof up (again)). For standalone use on osx I didn't use the fink one, but I forgot where I did get it from. It was very easy to set up, though. On windows there is a really nice standalone one (tortoisesvn) that integrates with the explorer so you can see on the file icons what the state of a file is. I know that there's a cvs2svn utility that converts your revision history (seems a requirement). Rutger -----Original Message----- > Date: Sat Jun 16 07:55:09 PDT 2007 > From: "Hilmar Lapp" > Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy > To: "Sean Davis" > > > On Jun 16, 2007, at 7:21 AM, Sean Davis wrote: > > > As for access, the typical access is over http (or https). > > We're using svn+ssh here (NESCent) so the password is the same as the > one you set for your account on the server, and you can use public/ > private key negotiation for authentication. > > I think the ability to not provide a password for every single > interaction is a requirement. If that requires using svn+ssh or can > be made to work through https too I don't know. On sf.net I have to > use https for svn and it doesn't ask me for the password each time. > Not sure how this works though, maybe some local caching? > > We should not be using http, or whatever other protocol that sends > unencrypted passwords. > > > Access controls can be set up on the server side while allowing > > anonymous access for checkout. There are many excellent SVN for > > every OS, so that should not be a problem. > > On Mac OSX the most convenient way I have found is through fink. It > does ask to install 30 other dependencies, which had me balk at > first, but me doing it by hand is even worse than fink doing it, so I > finally gave in and it's really a breeze. I've not had a single issue. 
> > From a sysadmin perspective, what might be worth keeping in mind is > that svn is going to store everything in a database (BerkeleyDB I > think). I.e., there is no such thing anymore as restoring individual > source code files from backup if one gets accidentally corrupted on > the server. It seems you have to restore the entire database, i.e., > the entire repository. I vaguely recall though that how svn manages > the repository is actually configurable and that other storage than > DB is possible too. Don't ask me for the pros and cons of one vs the > other. > > -hilmar > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From rvos at interchange.ubc.ca Sat Jun 16 17:15:45 2007 From: rvos at interchange.ubc.ca (rvos) Date: Sat, 16 Jun 2007 10:15:45 -0700 (PDT) Subject: [Bioperl-l] SVN and ...Re: Perltidy Message-ID: <27462805.1182014145280.JavaMail.myubc2@brahms.my.ubc.ca> A brief word on the topic of perltidy: no. I like what it does, and I sort of follow one of its settings (-syn -sob -b), but if you run it on a whole source tree it'll screw up the diffs, and I'm still worried about it breaking things (though really it shouldn't, it creates a *.bak if something doesn't compile anymore). Rutger -----Original Message----- > Date: Sat Jun 16 10:09:18 PDT 2007 > From: "rvos" > Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy > To: "Hilmar Lapp" , "Sean Davis" > > CIPRES and Bio::Phylo use svn. As for the benefits, a lot of sales talk has been expended over it already, for my own purpose I like the integration with eclipse (through subclipse plugin) and komodo, in addition to the atomic commits (so I can ctrl+c if I goof up (again)). 
> > For standalone use on osx I didn't use the fink one, but I forgot where I did get it from. It was very easy to set up, though. On windows there is a really nice standalone one (tortoisesvn) that integrates with the explorer so you can see on the file icons what the state of a file is. I know that there's a cvs2svn utility that converts your revision history (seems a requirement). > > Rutger > > > -----Original Message----- > > > Date: Sat Jun 16 07:55:09 PDT 2007 > > From: "Hilmar Lapp" > > Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy > > To: "Sean Davis" > > > > > > On Jun 16, 2007, at 7:21 AM, Sean Davis wrote: > > > > > As for access, the typical access is over http (or https). > > > > We're using svn+ssh here (NESCent) so the password is the same as the > > one you set for your account on the server, and you can use public/ > > private key negotiation for authentication. > > > > I think the ability to not provide a password for every single > > interaction is a requirement. If that requires using svn+ssh or can > > be made to work through https too I don't know. On sf.net I have to > > use https for svn and it doesn't ask me for the password each time. > > Not sure how this works though, maybe some local caching? > > > > We should not be using http, or whatever other protocol that sends > > unencrypted passwords. > > > > > Access controls can be set up on the server side while allowing > > > anonymous access for checkout. There are many excellent SVN for > > > every OS, so that should not be a problem. > > > > On Mac OSX the most convenient way I have found is through fink. It > > does ask to install 30 other dependencies, which had me balk at > > first, but me doing it by hand is even worse than fink doing it, so I > > finally gave in and it's really a breeze. I've not had a single issue. > > > > From a sysadmin perspective, what might be worth keeping in mind is > > that svn is going to store everything in a database (BerkeleyDB I > > think). 
I.e., there is no such thing anymore as restoring individual > > source code files from backup if one gets accidentally corrupted on > > the server. It seems you have to restore the entire database, i.e., > > the entire repository. I vaguely recall though that how svn manages > > the repository is actually configurable and that other storage than > > DB is possible too. Don't ask me for the pros and cons of one vs the > > other. > > > > -hilmar > > -- > > =========================================================== > > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > > =========================================================== From george.heller at yahoo.com Sat Jun 16 17:29:26 2007 From: george.heller at yahoo.com (George Heller) Date: Sat, 16 Jun 2007 10:29:26 -0700 (PDT) Subject: [Bioperl-l] Taxonomy hierarchy extraction Message-ID: <959624.48556.qm@web56502.mail.re3.yahoo.com> Hi all, I am looking at extracting the taxonomy hierarchy for some taxon ids. What I plan to do is, for a given taxon id, say 33090, I want to extract all taxon ids that are children of this species. I do not just want the immediate children, but the children's children and so on. Any ideas on the way I can go about doing this? George
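[Editorial sketch, not part of the original thread.] The traversal asked about in the question above (collect the children, the children's children, and so on) can be illustrated on a toy tree. This is only a sketch under stated assumptions: it is written in Python rather than Perl, the names `children_of` and `all_descendants` are hypothetical, the ids are made up, and the child lookup is a plain dictionary. In BioPerl the child lookup would presumably come from Bio::DB::Taxonomy instead. The sketch uses an explicit queue rather than a recursive sub; either approach visits the same nodes.

```python
from collections import deque


def all_descendants(children_of, root_id):
    """Return every id below root_id: children, grandchildren, and so on.

    children_of is a hypothetical mapping {parent_id: [child_id, ...]}.
    The root itself is not included in the result.
    """
    found = []
    queue = deque(children_of.get(root_id, []))
    while queue:
        taxon_id = queue.popleft()
        found.append(taxon_id)
        # Schedule this node's own children so deeper levels are visited too.
        queue.extend(children_of.get(taxon_id, []))
    return found


# Toy hierarchy with made-up ids (not real NCBI taxonomy data).
toy_children = {1: [2, 3], 2: [4, 5], 5: [6]}

print(all_descendants(toy_children, 1))
```

The queue gives a level-by-level order; swapping `popleft()` for `pop()` would give a depth-first order instead. For a large taxonomy fetched over the network, each child lookup may be a remote call, so a local copy of the taxonomy is likely to be much faster.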
From bix at sendu.me.uk Sat Jun 16 18:21:38 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Sat, 16 Jun 2007 19:21:38 +0100 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <959624.48556.qm@web56502.mail.re3.yahoo.com> References: <959624.48556.qm@web56502.mail.re3.yahoo.com> Message-ID: <46742A32.90305@sendu.me.uk> George Heller wrote: > Hi all, > > I am looking at extracting the taxonomy hierarchy for some taxon ids. > What I plan to do is, for a given taxon id, say 33090, I want to > extract all taxon ids that are children of this species. I do not > just want the immediate children, but the children's children and so > on. > > Any ideas on the way I can go about doing this? Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in some kind of looping structure. Most easily a recursing sub. If you happen to code up something neat and efficient, why not share it with us and we could add it to the Taxonomy module(s). From cjfields at uiuc.edu Sat Jun 16 19:23:43 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 16 Jun 2007 14:23:43 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> Message-ID: On Jun 16, 2007, at 9:55 AM, Hilmar Lapp wrote: > > On Jun 16, 2007, at 7:21 AM, Sean Davis wrote: > >> As for access, the typical access is over http (or https). > > We're using svn+ssh here (NESCent) so the password is the same as the > one you set for your account on the server, and you can use public/ > private key negotiation for authentication. > > I think the ability to not provide a password for every single > interaction is a requirement. If that requires using svn+ssh or can > be made to work through https too I don't know. On sf.net I have to > use https for svn and it doesn't ask me for the password each time. 
> Not sure how this works though, maybe some local caching? > > We should not be using http, or whatever other protocol that sends > unencrypted passwords. Agreed; it should be through ssh. >> Access controls can be set up on the server side while allowing >> anonymous access for checkout. There are many excellent SVN for >> every OS, so that should not be a problem. > > On Mac OSX the most convenient way I have found is through fink. It > does ask to install 30 other dependencies, which had me balk at > first, but me doing it by hand is even worse than fink doing it, so I > finally gave in and it's really a breeze. I've not had a single issue. > > From a sysadmin perspective, what might be worth keeping in mind is > that svn is going to store everything in a database (BerkeleyDB I > think). I.e., there is no such thing anymore as restoring individual > source code files from backup if one gets accidentally corrupted on > the server. It seems you have to restore the entire database, i.e., > the entire repository. I vaguely recall though that how svn manages > the repository is actually configurable and that other storage than > DB is possible too. Don't ask me for the pros and cons of one vs the > other. MacPorts/DarwinPorts also has subversion, various language bindings, cvs2svn, and various perl modules. There are also a few SVN GUIs lingering around (including live folders within Komodo). chris From cjfields at uiuc.edu Sat Jun 16 19:18:06 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 16 Jun 2007 14:18:06 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <27462805.1182014145280.JavaMail.myubc2@brahms.my.ubc.ca> References: <27462805.1182014145280.JavaMail.myubc2@brahms.my.ubc.ca> Message-ID: <1A314D08-8F3C-4A4B-B58D-64AC7952F149@uiuc.edu> I think it's viable as an option if the code really needs it. After 100+ commits some of the code has schizy coding styles, so cleaning it up helps. 
In those cases having a perltidy config file present wouldn't hurt. However I agree that it shouldn't be applied across every module and should be done judiciously (the commit message, for instance, should actually state the code was tidied). chris PS - Nice to see the ball is rolling on SVN! On Jun 16, 2007, at 12:15 PM, rvos wrote: > A brief word on the topic of perltidy: no. I like what it does, and > I sort of follow one of its settings (-syn -sob -b), but if you run > it on a whole source tree it'll screw up the diffs, and I'm still > worried about it breaking things (though really it shouldn't, it > creates a *.bak if something doesn't compile anymore). > > Rutger > > > > -----Original Message----- > >> Date: Sat Jun 16 10:09:18 PDT 2007 >> From: "rvos" >> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy >> To: "Hilmar Lapp" , "Sean Davis" >> >> >> CIPRES and Bio::Phylo use svn. As for the benefits, a lot of sales >> talk has been expended over it already, for my own purpose I like >> the integration with eclipse (through subclipse plugin) and >> komodo, in addition to the atomic commits (so I can ctrl+c if I >> goof up (again)). >> >> For standalone use on osx I didn't use the fink one, but I forgot >> where I did get it from. It was very easy to set up, though. On >> windows there is a really nice standalone one (tortoisesvn) that >> integrates with the explorer so you can see on the file icons what >> the state of a file is. I know that there's a cvs2svn utility that >> converts your revision history (seems a requirement). >> >> Rutger >> >> >> -----Original Message----- >> >>> Date: Sat Jun 16 07:55:09 PDT 2007 >>> From: "Hilmar Lapp" >>> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy >>> To: "Sean Davis" >>> >>> >>> On Jun 16, 2007, at 7:21 AM, Sean Davis wrote: >>> >>>> As for access, the typical access is over http (or https). 
>>> >>> We're using svn+ssh here (NESCent) so the password is the same as >>> the >>> one you set for your account on the server, and you can use public/ >>> private key negotiation for authentication. >>> >>> I think the ability to not provide a password for every single >>> interaction is a requirement. If that requires using svn+ssh or can >>> be made to work through https too I don't know. On sf.net I have to >>> use https for svn and it doesn't ask me for the password each time. >>> Not sure how this works though, maybe some local caching? >>> >>> We should not be using http, or whatever other protocol that sends >>> unencrypted passwords. >>> >>>> Access controls can be set up on the server side while allowing >>>> anonymous access for checkout. There are many excellent SVN for >>>> every OS, so that should not be a problem. >>> >>> On Mac OSX the most convenient way I have found is through fink. It >>> does ask to install 30 other dependencies, which had me balk at >>> first, but me doing it by hand is even worse than fink doing it, >>> so I >>> finally gave in and it's really a breeze. I've not had a single >>> issue. >>> >>> From a sysadmin perspective, what might be worth keeping in >>> mind is >>> that svn is going to store everything in a database (BerkeleyDB I >>> think). I.e., there is no such thing anymore as restoring individual >>> source code files from backup if one gets accidentally corrupted on >>> the server. It seems you have to restore the entire database, i.e., >>> the entire repository. I vaguely recall though that how svn manages >>> the repository is actually configurable and that other storage than >>> DB is possible too. Don't ask me for the pros and cons of one vs the >>> other. 
>>> >>> -hilmar >>> -- >>> =========================================================== >>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >>> =========================================================== >>> >>> >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hartzell at alerce.com Sat Jun 16 17:47:01 2007 From: hartzell at alerce.com (George Hartzell) Date: Sat, 16 Jun 2007 10:47:01 -0700 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <18035.14352.963113.473274@almost.alerce.com> <6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu> <6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net> Message-ID: <18036.8725.29073.619527@almost.alerce.com> Chris Fields writes: > Ah, got it. Sorry. > > George, planning on taking this up? I'm going to take a *peek*. I just finished (unless someone finds another issue) moving someone's cvs repository over to svn, so I have some tools cobbled together and some knowledge in the cache. I don't have too much idle time at the moment though, so if it gets gooey I'll just summarize what I learn. Either way it seems worth a peek. I will need the repository itself though. I'll post a note to support at open-bio.org. g. 
From jason at bioperl.org Sat Jun 16 23:54:18 2007
From: jason at bioperl.org (Jason Stajich)
Date: Sat, 16 Jun 2007 16:54:18 -0700
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <18036.8725.29073.619527@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <18035.14352.963113.473274@almost.alerce.com> <6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu> <6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net> <18036.8725.29073.619527@almost.alerce.com>
Message-ID: <6F57475B-715F-49D1-B6D2-F3FD3ACCB728@bioperl.org>

Thanks George. I'll respond to your support ticket as well, but I put up tarballs of the repository as of today. I had thought at one point ChrisD might have set up rsync-able access to the whole repository through code.open-bio.org, but for now I have put up tarballs of most of the CVS dirs from bioperl:

http://bioperl.org/uploads/

Just to say, I already went through all the steps of running cvs2svn myself and had problems gathering back out the branches and all the tags when I tried it. You may want to start with a smaller repository like bioperl-network or bioperl-db, as the initial cvs2svn conversion script took quite a long time to run on bioperl-live.

Regarding ssh/https: We have already gone through some of this for the blipkit and biojava projects. I think we'll still keep separate anonymous read-only (code.open-bio.org) and writeable (dev.open-bio.org) repositories, as I think we are resisting any webapps on the development server because we want that to be as locked down as possible. For the newly created svn repositories that I've been creating/using I just use svn+ssh and that worked okay.

-jason

On Jun 16, 2007, at 10:47 AM, George Hartzell wrote:

> Chris Fields writes:
>> Ah, got it. Sorry.
>>
>> George, planning on taking this up?
>
> I'm going to take a *peek*.
> I just finished (unless someone finds
> another issue) moving someone's cvs repository over to svn, so I have
> some tools cobbled together and some knowledge in the cache.
>
> I don't have too much idle time at the moment though, so if it gets
> gooey I'll just summarize what I learn. Either way it seems worth a
> peek.
>
> I will need the repository itself though. I'll post a note to
> support at open-bio.org.
>
> g.

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/

From hartzell at alerce.com Sat Jun 16 23:56:09 2007
From: hartzell at alerce.com (George Hartzell)
Date: Sat, 16 Jun 2007 16:56:09 -0700
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <46739D69.4090204@sheffield.ac.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <18035.14352.963113.473274@almost.alerce.com> <46739D69.4090204@sheffield.ac.uk>
Message-ID: <18036.30873.609341.181853@almost.alerce.com>

Nathan S. Haigh writes:
> [...]
> Sounds like George might know what he's doing!

Hey, I've been looking for a Marketing Director. Want a job?

> I have a question about
> setting up svn access. I believe access can be done in several ways,
> over webdav, over ssh and probably others too. Do you have any knowledge
> about the benefits of one over the other? I suppose I'm thinking of what
> to implement to allow anonymous read access for users and authenticated
> access for developers.

There are two and a half ways to talk to the repository:

- You can put it behind a web server (e.g. apache) and get at it using http/https. Authentication and authorization happen using the normal web server tricks, so as long as you don't do anything silly (e.g. don't use basic auth, stick with mod_auth_digest), even http connections won't send passwords in the clear. You can define users in .htpasswd files or use any of the fancier setups (e.g. sql databases, etc...).

- You can talk to it via subversion's simple server, svnserve. There are two ways you usually talk to svnserve (neither of which sends passwords in the clear):

  * directly, using a URL like

      svn://svn.example.com/repo/proj/trunk

    When you do this, the client either talks directly to a copy of svnserve running as a daemon, or possibly to something like inetd that'll start an svnserve as necessary. In this case, you define authentication and authorization info in an svnserve.conf file.

  * indirectly, using a URL like

      svn+ssh://svn.example.com/repo/proj/trunk/

    in which case you make an ssh connection to the server machine (and authenticate via ssh mechanisms; anything other than a key-pair will drive you nuts with repeated password requests) and then an svnserve process is started up for you in "tunnel mode". Access control is coarse-grained and done via OS-level access permissions. Generally in this case you need to give out shell accounts to everyone involved, or (tsk, tsk) have them use a common account. There's a cute trick in the svn book that shows how to use a shared ssh account but still have all of the changes in the repo keep track of the real user. I've never tried it....

- If you're on the same machine as the repo, you can simply use:

    file:///path/to/repo/proj/trunk

The biggest deciding factor is how you want to manage your users and whether you're already messing around with a web server. I've generally worked in small groups where everyone's had ssh access, but I've set it up the other ways too.

You can even access via multiple paths. The only trick is that the repository needs to be writable by whoever's committing, and if they're running svnserve themselves (file: or svn+ssh:) and things aren't set up right (all the dirs in the repo need to be group writable and have the magic bit set so that any new stuff created is also writable, users' umasks and group membership need to be aligned) then things go fubar. Google's your friend here, and each of the OSes/distros has a standard hack for making this work, usually involving a wrapper app that takes care of things.

Feel free to ask any particular questions.

Phew,

g.

From jason at bioperl.org Sun Jun 17 00:17:58 2007
From: jason at bioperl.org (Jason Stajich)
Date: Sat, 16 Jun 2007 17:17:58 -0700
Subject: [Bioperl-l] seq doesn't validate error
In-Reply-To: <200706151653.04135.sheris@eps.berkeley.edu>
References: <200706151558.12911.sheris@eps.berkeley.edu> <1A4207F8295607498283FE9E93B775B4034C6E7B@EX02.asurite.ad.asu.edu> <200706151653.04135.sheris@eps.berkeley.edu>
Message-ID: <6A369DE9-943A-4DF1-9DF0-F68E361C8C20@bioperl.org>

The error is clearly saying there must be a symbol or letter in your sequence that violates the regexp. I had modified the code in CVS to actually provide a more informative mismatch error in the error message, but this is probably not in the release you are using.

Anyways, add this to see what is causing the problem:

  print join(",",($nstarthash{$_}[1] =~ /([^$Bio::PrimarySeq::MATCHPATTERN]+)/g)), "\n";

-jason

On Jun 15, 2007, at 4:53 PM, Sheri Simmons wrote:

> Thanks for the suggestion, but that still gives the same error as
> before.
>
> On Friday 15 June 2007 4:11 pm, Kevin Brown wrote:
>>> I'm getting an error as follows when I try to reverse
>>> complement a sequence string stored in a hash of arrays. The
>>> storage code is:
>>>
>>> $nstarthash{$key} = [$sortchecks[0], join("",
>>> @nseq),
>>> join("",@{$seqhash{$key}})];
>>>
>>> the sequence of interest is the element at index 1.
>>> >>> Later, I try to retrieve this string for a subset of keys so >>> I can reverse complement it based on input from another hash >>> (%complement): >>> >>> my %revcomphash = map { my $read = $_; >>> grep $complement{$read} eq 'C', %complement; >>> {$_, (Bio::Seq->new(-seq >>> =>$nstarthash{$_}[1]))->revcom->seq()};} >>> keys(%nstarthash); >>> >>> >>> I get the following warning (long sequence edited for clarity): >>> >>> -- -------------------- WARNING --------------------- >>> MSG: seq doesn't validate, mismatch is 1 >>> --------------------------------------------------- >>> >>> ------------- EXCEPTION ------------- >>> MSG: Attempting to set the sequence to >>> [GCCCCTGTAATCGCTTTTATATCGTCAGCGATC] >>> which does not look healthy >>> STACK Bio::PrimarySeq::seq /usr/share/perl5/Bio/PrimarySeq.pm:268 >>> STACK Bio::PrimarySeq::new /usr/share/perl5/Bio/PrimarySeq.pm:217 >>> STACK Bio::Seq::new /usr/share/perl5/Bio/Seq.pm:498 STACK >>> toplevel ../quality_wrapper.pl:103 >>> >>> I cannot find any non-allowed characters in the sequence, and >>> the de-referencing appears to work correctly. Can anyone help me? >>> I'm using the latest Bioperl installation (1.5.2) with >>> ActivePerl5.8 on a Mepis 6.5 system. >> >> Try telling the Bio::Seq object what alphabet to use when creating >> it. >> I tend to create them like: >> >> Bio::Seq->new(-seq=> $seqvar, -alphabet=>'dna') > > -- > Sheri Simmons > Department of Earth and Planetary Sciences > University of California, Berkeley > Berkeley, CA 94720-4767 > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From n.haigh at sheffield.ac.uk Sun Jun 17 11:45:11 2007 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Sun, 17 Jun 2007 12:45:11 +0100 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <10517356.1182013758160.JavaMail.myubc2@brahms.my.ubc.ca> References: <10517356.1182013758160.JavaMail.myubc2@brahms.my.ubc.ca> Message-ID: <46751EC7.8020609@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 rvos wrote: > CIPRES and Bio::Phylo use svn. As for the benefits, a lot of sales talk has been expended over it already, for my own purpose I like the integration with eclipse (through subclipse plugin) and komodo, in addition to the atomic commits (so I can ctrl+c if I goof up (again)). > > For standalone use on osx I didn't use the fink one, but I forgot where I did get it from. It was very easy to set up, though. On windows there is a really nice standalone one (tortoisesvn) that integrates with the explorer so you can see on the file icons what the state of a file is. I know that there's a cvs2svn utility that converts your revision history (seems a requirement). > > Rutger > > Just to clarify, subversion is available as command line for windows: http://subversion.tigris.org/project_packages.html TortoiseSVN is another svn client with a GUI that integrates into the shell. I tried setting this up a while back to use ssh (via PUTTY), but I wasn't successful. This may have been due to me just starting out with svn or that it was harder to setup in an earlier version of TortoiseSVN. Does anyone have experience of setting up svn on Windows to use ssh? If the changeover takes place, I'm happy to write some howto's for setting up svn clients for Windows. 
Nath -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGdR7HczuW2jkwy2gRAmgOAJ96wLzVYbjqEPborZTsw6gwU6UitgCfV02v 8xHJvn/Eqf9LePR3Ei0ZaIw= =t5pN -----END PGP SIGNATURE----- From george.heller at yahoo.com Sun Jun 17 18:41:55 2007 From: george.heller at yahoo.com (George Heller) Date: Sun, 17 Jun 2007 11:41:55 -0700 (PDT) Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <46742A32.90305@sendu.me.uk> Message-ID: <148654.15952.qm@web56511.mail.re3.yahoo.com> Hi all, Can anyone point me to some example that uses the get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at this, and I am not quite sure how to implement it. Thanks. George Sendu Bala wrote: George Heller wrote: > Hi all, > > I am looking at extracting the taxonomy hierarchy for some taxon ids. > What I plan to do is, for a given taxon id, say 33090, I want to > extract all taxon ids that are children of this species. I do not > just want the immediate children, but the children's children and so > on. > > Any ideas on the way I can go about doing this? Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in some kind of looping structure. Most easily a recursing sub. If you happen to code up something neat and efficient, why not share it with us and we could add it to the Taxonomy module(s). --------------------------------- Shape Yahoo! in your own image. Join our Network Research Panel today! From jason at bioperl.org Sun Jun 17 20:48:05 2007 From: jason at bioperl.org (Jason Stajich) Date: Sun, 17 Jun 2007 13:48:05 -0700 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <148654.15952.qm@web56511.mail.re3.yahoo.com> References: <148654.15952.qm@web56511.mail.re3.yahoo.com> Message-ID: <617EAA37-D502-42AE-8820-C9514DBF7ACD@bioperl.org> I assume you already figured out how to setup a local taxonomydb? 
You just want the extant species/leaves of the tree:

  my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

-jason

On Jun 17, 2007, at 11:41 AM, George Heller wrote:

> Hi all,
>
> Can anyone point me to some example that uses the
> get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at
> this, and I am not quite sure how to implement it.
>
> Thanks.
> George
>
> Sendu Bala wrote:
> George Heller wrote:
>> Hi all,
>>
>> I am looking at extracting the taxonomy hierarchy for some taxon ids.
>> What I plan to do is, for a given taxon id, say 33090, I want to
>> extract all taxon ids that are children of this species. I do not
>> just want the immediate children, but the children's children and so
>> on.
>>
>> Any ideas on the way I can go about doing this?
>
> Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
> some kind of looping structure. Most easily a recursing sub.
>
> If you happen to code up something neat and efficient, why not
> share it
> with us and we could add it to the Taxonomy module(s).

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/

From aaron.j.mackey at gsk.com Mon Jun 18 02:35:42 2007
From: aaron.j.mackey at gsk.com (aaron.j.mackey at gsk.com)
Date: Sun, 17 Jun 2007 22:35:42 -0400
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <46742A32.90305@sendu.me.uk>
Message-ID:

To do so efficiently, you might want to check out:

http://www.oreillynet.com/pub/a/network/2002/11/27/bioconf.html

-Aaron

bioperl-l-bounces at lists.open-bio.org wrote on 06/16/2007 02:21:38 PM:

> George Heller wrote:
> > Hi all,
> >
> > I am looking at extracting the taxonomy hierarchy for some taxon ids.
> > What I plan to do is, for a given taxon id, say 33090, I want to > > extract all taxon ids that are children of this species. I do not > > just want the immediate children, but the children's children and so > > on. > > > > Any ideas on the way I can go about doing this? > > Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in > some kind of looping structure. Most easily a recursing sub. > > If you happen to code up something neat and efficient, why not share it > with us and we could add it to the Taxonomy module(s). > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From aaron.j.mackey at gsk.com Mon Jun 18 02:34:12 2007 From: aaron.j.mackey at gsk.com (aaron.j.mackey at gsk.com) Date: Sun, 17 Jun 2007 22:34:12 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: Message-ID: > On Jun 16, 2007, at 7:21 AM, Sean Davis wrote: > > > As for access, the typical access is over http (or https). > > We're using svn+ssh here (NESCent) Let me just note that https is preferable to ssh for those poor slobs stuck behind a corporate firewall (svn happily prompts me for my proxy server's user/pass, then my https authentication realm's user/pass - all then get cached in some .svn/ file that I don't have to worry about again until my proxy server password changes once a month ...) -Aaron From george.heller at yahoo.com Mon Jun 18 04:21:45 2007 From: george.heller at yahoo.com (George Heller) Date: Sun, 17 Jun 2007 21:21:45 -0700 (PDT) Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <617EAA37-D502-42AE-8820-C9514DBF7ACD@bioperl.org> Message-ID: <487845.37410.qm@web56510.mail.re3.yahoo.com> Thanks. And how can I assign the $node here in the below code, such that I can reference it to a particular taxon id record? I want to retrieve all the descendents from the taxonomy hierarchy, given a particular taxon id. 
I have a local db setup, in which I have uploaded data using the load_ncbi_taxonomy.pl script. Thanks. George Jason Stajich wrote: I assume you already figured out how to setup a local taxonomydb? You just want the extant species/leaves of the tree my @extant_children = grep { $_->is_Leaf } $node->get_all_Descedents; -jason On Jun 17, 2007, at 11:41 AM, George Heller wrote: Hi all, Can anyone point me to some example that uses the get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at this, and I am not quite sure how to implement it. Thanks. George Sendu Bala wrote: George Heller wrote: Hi all, I am looking at extracting the taxonomy hierarchy for some taxon ids. What I plan to do is, for a given taxon id, say 33090, I want to extract all taxon ids that are children of this species. I do not just want the immediate children, but the children's children and so on. Any ideas on the way I can go about doing this? Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in some kind of looping structure. Most easily a recursing sub. If you happen to code up something neat and efficient, why not share it with us and we could add it to the Taxonomy module(s). --------------------------------- Shape Yahoo! in your own image. Join our Network Research Panel today! _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ --------------------------------- Need a vacation? Get great deals to amazing places on Yahoo! Travel. From bix at sendu.me.uk Mon Jun 18 10:44:00 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 18 Jun 2007 11:44:00 +0100 Subject: [Bioperl-l] Network tests overhaul Message-ID: <467661F0.2060703@sendu.me.uk> When the test suite runs currently, most (the intent is all) tests skip if the test would require network (internet) access. 
This is to avoid tests failing not due to bugs in Bioperl code, but due to temporarily inaccessible servers. This is also to make running the test suite faster.

To do a complete test you currently have to set BIOPERLDEBUG to true, which activates the network tests but also increases verbosity. This actually causes a problem: when running the entire test suite the additional debug information is more a hindrance than a help, since the reams of printed information can hide significant warnings that may also get printed. It's also ugly.

The solution is to divorce activation of network tests from the request for verbosity. The obvious implementation is to have another environment variable, perhaps BIOPERLNETWORK. However, there is an opportunity to do something more appropriate. The running of networking tests should be a choice given to every end-user installing Bioperl. Debugging information, on the other hand, is only of interest to the developer working on a specific module under test, so can be left as a 'hidden' env var.

I have just committed one possible implementation along these lines. You say:

  perl Build.PL

as normal, and if you seem to have internet access it asks you if you'd like to run network tests. The default answer is no. If you answer yes, network tests will be enabled. You can alternatively say:

  perl Build.PL --network

and if you seem to have internet access, network tests will be enabled. Then you run the tests:

  ./Build test

Any tests written to support the new system will then skip network tests if they haven't been enabled. The only test I've written to support the new system is t/RemoteBlast.t:

  ./Build test --test_files t/RemoteBlast.t --verbose

Adding support to test scripts consists of the following changes:

+ use Module::Build;
+ my $build = Module::Build->current(get_options => { network => {} });
+ my $do_network_tests = $build->notes('network');

! if (!$ENV{'BIOPERLDEBUG'}) { # skip network tests
---
! if (!$do_network_tests) { # skip network tests

I propose adding this support to all test scripts that carry out network tests. Does anyone have objections? Does anyone have alternate implementations that may be superior?

I specifically suggest we don't use an env var in addition to the above, because the multiple ways of doing things could lead to confusion. Which takes priority? Did a user really have the networking tests turned on when he reported his test results?

The one thing I need help with is identifying which tests attempt to access the internet. I think we caught most of them for the 1.5.2 release, but I think there are more lurking around. Can anyone offer a way to systematically find at least the test scripts which access the internet, if not the specific tests within?

Cheers,
Sendu.

From bix at sendu.me.uk Mon Jun 18 10:46:17 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 18 Jun 2007 11:46:17 +0100
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <467661F0.2060703@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk>
Message-ID: <46766279.7050202@sendu.me.uk>

Sendu Bala wrote:
> Adding support to test scripts consists of the following changes:
>
> + use Module::Build;
> + my $build = Module::Build->current(get_options => { network => {} });

That should read:
+ my $build = Module::Build->current();

> + my $do_network_tests = $build->notes('network');

From cjfields at uiuc.edu Mon Jun 18 11:45:10 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 06:45:10 -0500
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <46766279.7050202@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk> <46766279.7050202@sendu.me.uk>
Message-ID:

The idea sounds good, though if we plan on doing this we need to update the Test HOWTO as well.

Some modules require only a few (<50% of the total) network tests; I think SeqFeature.t may be one, though I'm not sure. Does this handle those cases?
chris

On Jun 18, 2007, at 5:46 AM, Sendu Bala wrote:

> Sendu Bala wrote:
>> Adding support to test scripts consists of the following changes:
>>
>> + use Module::Build;
>> + my $build = Module::Build->current(get_options => { network =>
>> {} });
>
> That should read:
> + my $build = Module::Build->current();
>
>> + my $do_network_tests = $build->notes('network');

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From bix at sendu.me.uk Mon Jun 18 11:49:18 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 18 Jun 2007 12:49:18 +0100
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To:
References: <467661F0.2060703@sendu.me.uk> <46766279.7050202@sendu.me.uk>
Message-ID: <4676713E.1000508@sendu.me.uk>

Chris Fields wrote:
> The idea sounds good, though if we plan on doing this we need to update
> the Test HOWTO as well.
>
> Some modules require only a few (<50% of the total) network tests; I
> think SeqFeature.t may be one, though I'm not sure. Does this handle
> those cases?

Yes, the system just gives the test script a boolean describing whether network tests should be run. The script can then do whatever it wants with the boolean: skip all tests, skip no tests, skip just some tests... It's a drop-in replacement for the current 'debug' boolean based on BIOPERLDEBUG.
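A minimal sketch of the "skip just some tests" case, using the SKIP block from Test::More. The test names and the placeholder ok(1, ...) calls are made up for illustration, and $do_network_tests is hard-coded here so the sketch runs on its own; in a real test script it would come from $build->notes('network') as described above.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Assumption: a real script would set this from Module::Build, e.g.
#   my $do_network_tests = Module::Build->current->notes('network');
# It is hard-coded to false here so the example is self-contained.
my $do_network_tests = 0;

# A purely local test: always runs.
ok(length('ACGT') == 4, 'local test always runs');

SKIP: {
    # Skip exactly the two network-dependent tests when not enabled.
    skip('network tests have not been enabled', 2) unless $do_network_tests;

    # Placeholders standing in for tests that would contact a remote server.
    ok(1, 'remote query submitted');
    ok(1, 'remote result retrieved');
}
```

Because the skipped tests are still counted, the plan (tests => 3) stays the same whether or not network tests are enabled.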
From hlapp at gmx.net Mon Jun 18 12:38:25 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 08:38:25 -0400
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <487845.37410.qm@web56510.mail.re3.yahoo.com>
References: <487845.37410.qm@web56510.mail.re3.yahoo.com>
Message-ID: <12C87737-B6D5-46E5-BCF4-5388A949009B@gmx.net>

I'm a bit confused - it sounds like you have set up a local BioSQL database and loaded the NCBI taxonomy into the database. You can now use simple SQL to retrieve all descendants of a node in the tree given its NCBI taxonID, such as

  SELECT tn.*, tnm.name
  FROM taxon tn, taxon_name tnm, taxon n
  WHERE n.ncbi_taxon_id = :taxonID
  AND tn.left_value > n.left_value
  AND tn.right_value < n.right_value
  AND tn.taxon_id = tnm.taxon_id
  AND tnm.name_class = 'scientific name'

BioPerl doesn't have a Taxonomy::biosql module yet (though this would seem like a worthwhile thing to add), so you can't use the Bio::DB::Taxonomy interface to do this against a BioSQL instance.

However, BioPerl does have support for the flat-file download of the NCBI taxonomy database and indexes it, so you can simply use Taxonomy::{get_taxon,get_all_Descendents} with the flatfile download to achieve what you wanted to do in less than 5 lines of perl. Although the recursive implementation of Taxonomy::get_all_Descendents() won't be lightning fast, it may still be perfectly fine for your application - are you sure it is not?

-hilmar

On Jun 18, 2007, at 12:21 AM, George Heller wrote:

> Thanks. And how can I assign the $node here in the below code, such
> that I can reference it to a particular taxon id record? I want to
> retrieve all the descendents from the taxonomy hierarchy, given a
> particular taxon id.
>
> I have a local db setup, in which I have uploaded data using the
> load_ncbi_taxonomy.pl script.
>
> Thanks.
> George
>
> Jason Stajich wrote:
> I assume you already figured out how to setup a local taxonomydb?
> > > You just want the extant species/leaves of the tree > > > my @extant_children = grep { $_->is_Leaf } $node->get_all_Descedents; > > > > -jason > On Jun 17, 2007, at 11:41 AM, George Heller wrote: > > Hi all, > > > Can anyone point me to some example that uses the > get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at > this, and I am not quite sure how to implement it. > > > Thanks. > George > > > Sendu Bala wrote: > George Heller wrote: > Hi all, > > > I am looking at extracting the taxonomy hierarchy for some taxon > ids. > What I plan to do is, for a given taxon id, say 33090, I want to > extract all taxon ids that are children of this species. I do not > just want the immediate children, but the children's children and so > on. > > > Any ideas on the way I can go about doing this? > > > Well, you'll use Bio::DB::Taxonomy presumably, and > each_Descendent in > some kind of looping structure. Most easily a recursing sub. > > > If you happen to code up something neat and efficient, why not > share it > with us and we could add it to the Taxonomy module(s). > > > > > > > --------------------------------- > Shape Yahoo! in your own image. Join our Network Research Panel > today! > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > -- > Jason Stajich > jason at bioperl.org > http://jason.open-bio.org/ > > > > > > > > --------------------------------- > Need a vacation? Get great deals to amazing places on Yahoo! Travel. 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Mon Jun 18 12:44:22 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 18 Jun 2007 08:44:22 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: Message-ID: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> Just curious - how do you cvs commit then to an external repository? Is that open in the firewall? It is true though that corporations typically will not permit any encrypted outgoing traffic through their firewall except https. sf.net only supports https for svn, AFAIK. -hilmar On Jun 17, 2007, at 10:34 PM, aaron.j.mackey at gsk.com wrote: >> On Jun 16, 2007, at 7:21 AM, Sean Davis wrote: >> >>> As for access, the typical access is over http (or https). >> >> We're using svn+ssh here (NESCent) > > Let me just note that https is preferable to ssh for those poor slobs > stuck behind a corporate firewall (svn happily prompts me for my proxy > server's user/pass, then my https authentication realm's user/pass > - all > then get cached in some .svn/ file that I don't have to worry about > again > until my proxy server password changes once a month ...) 
> > -Aaron > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Mon Jun 18 12:47:56 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 18 Jun 2007 08:47:56 -0400 Subject: [Bioperl-l] Network tests overhaul In-Reply-To: <467661F0.2060703@sendu.me.uk> References: <467661F0.2060703@sendu.me.uk> Message-ID: Sounds like a great idea to me. -hilmar On Jun 18, 2007, at 6:44 AM, Sendu Bala wrote: > When the test suite runs currently, most (the intent is all) tests > skip > if the test would require network (internet) access. This is to avoid > tests failing not due to bugs in Bioperl code, but due to temporarily > inaccessible servers. This is also to make running the test suite > faster. > > To do a complete test you currently have to set BIOPERLDEBUG to true, > which activates the network test but also increases verbosity. This > actually causes a problem, since when running the entire test suite > the > additional debug information is more a hindrance than a help, since > the > reams of printed information can hide significant warnings that may > also > get printed. Its also ugly. > > The solution is to divorce activation of network tests from the > request > for verbosity. The obvious implementation is to have another > environment > variable, perhaps BIOPERLNETWORK. However, there is an opportunity > to do > something more appropriate. The running of networking tests should > be a > choice given to every end-user installing Bioperl. Debugging > information, on the other hand, is only of interest to the developer > working on a specific module under test, so can be left as a 'hidden' > env var. 
> > > I have just committed one possible implementation along these lines. > > You say: > perl Build.PL > as normal, and if you seem to have internet access it asks you if > you'd > like to run network tests. The default answer is no. If you answer > yes, > network tests will be enabled. > > You can alternatively say: > perl Build.PL --network > and if you seem to have internet access, network tests will be > enabled. > > Then you run the tests: > ./Build test > Any tests written to support the new system will then skip network > tests > if they haven't been enabled. > > The only test I've written to support the new system is t/ > RemoteBlast.t: > ./Build test --test_files t/RemoteBlast.t --verbose > > > Adding support to test scripts consists of the following changes: > > + use Module::Build; > + my $build = Module::Build->current(get_options => { network => > {} }); > + my $do_network_tests = $build->notes('network'); > > ! if (!$ENV{'BIOPERLDEBUG'}) { # skip network tests > --- > ! if (!$do_network_tests) { # skip network tests > > > I propose adding this support to all test scripts that carry out > network > tests. Does anyone have objections? Does anyone have alternate > implementations that may be superior? > > I specifically suggest we don't use an env var in addition to the > above, > because the multiple ways of doing things could lead to confusion. > Which > takes priority? Did a user really have the networking tests turned on > when he reported his test results? > > > The one thing I need help with is identifying which tests attempt to > access the internet. I think we caught most of them for the 1.5.2 > release, but I think there are more lurking around. Can anyone offer a > way to systematically find at least the test scripts which access the > internet, if not the specific tests within? > > Cheers, > Sendu. 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Mon Jun 18 12:55:53 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 18 Jun 2007 07:55:53 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> References: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> Message-ID: <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu> On Jun 18, 2007, at 7:44 AM, Hilmar Lapp wrote: > Just curious - how do you cvs commit then to an external repository? > Is that open in the firewall? > > It is true though that corporations typically will not permit any > encrypted outgoing traffic through their firewall except https. > sf.net only supports https for svn, AFAIK. > > -hilmar If so it may be better to allow https, though I don't know how Chris D. and others feel about it. Did we make a decision as to the fate of cvs if we get svn up-and- running? Keep it around (assuming svn commits would be carried over to cvs and vice versa)? Or see what happens over time? chris From sdavis2 at mail.nih.gov Mon Jun 18 13:05:50 2007 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Mon, 18 Jun 2007 09:05:50 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: Message-ID: <4676832E.5080704@mail.nih.gov> aaron.j.mackey at gsk.com wrote: >> On Jun 16, 2007, at 7:21 AM, Sean Davis wrote: >> >>> As for access, the typical access is over http (or https). 
>> We're using svn+ssh here (NESCent) > > Let me just note that https is preferable to ssh for those poor slobs > stuck behind a corporate firewall (svn happily prompts me for my proxy > server's user/pass, then my https authentication realm's user/pass - all > then get cached in some .svn/ file that I don't have to worry about again > until my proxy server password changes once a month ...) That would be my suggestion as well (although I added it only parenthetically). Sean From hlapp at gmx.net Mon Jun 18 13:13:27 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 18 Jun 2007 09:13:27 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu> References: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu> Message-ID: <6759E154-CB02-4D76-8AB8-26C19FB952D2@gmx.net> On Jun 18, 2007, at 8:55 AM, Chris Fields wrote: > Did we make a decision as to the fate of cvs if we get svn up-and- > running? Keep it around (assuming svn commits would be carried > over to cvs and vice versa)? Or see what happens over time? Let's not plan for having cvs and svn writable repositories in parallel - that would create an administrative nightmare. Once the tests complete, there'll be a clean cut-over. What Jason suggested is to try and continue a read-only (anonymous) cvs repository, updated from the svn repository that the developers use, aside from an anonymous svn repository mirroring the writable one. This would primarily be for maintaining working URLs for those folks who http-linked into the anonymous cvs repository. What I added earlier is that even if that fails to be feasible, you can achieve the goal using some small CGI script and apache redirect to map CVS- style links to the anonymous svn repository. 
-hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Mon Jun 18 13:31:35 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 18 Jun 2007 08:31:35 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <6759E154-CB02-4D76-8AB8-26C19FB952D2@gmx.net> References: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu> <6759E154-CB02-4D76-8AB8-26C19FB952D2@gmx.net> Message-ID: <0E64DBD0-BBE9-411A-A146-70236EF558BB@uiuc.edu> On Jun 18, 2007, at 8:13 AM, Hilmar Lapp wrote: > > On Jun 18, 2007, at 8:55 AM, Chris Fields wrote: > >> Did we make a decision as to the fate of cvs if we get svn up-and- >> running? Keep it around (assuming svn commits would be carried >> over to cvs and vice versa)? Or see what happens over time? > > Let's not plan for having cvs and svn writable repositories in > parallel - that would create an administrative nightmare. Once the > tests complete, there'll be a clean cut-over. My thoughts as well. Much simpler. > What Jason suggested is to try and continue a read-only (anonymous) > cvs repository, updated from the svn repository that the developers > use, aside from an anonymous svn repository mirroring the writable > one. This would primarily be for maintaining working URLs for those > folks who http-linked into the anonymous cvs repository. What I > added earlier is that even if that fails to be feasible, you can > achieve the goal using some small CGI script and apache redirect to > map CVS-style links to the anonymous svn repository. > > -hilmar I like the idea of a read-only cvs or a 'faux' cvs, though the former would initially be easier as we already have it available. We could just lock it down at some switchover point to read-only (something I think Jason also suggested). 
chris From bix at sendu.me.uk Mon Jun 18 13:13:33 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 18 Jun 2007 14:13:33 +0100 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> <3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu> Message-ID: <467684FD.3080300@sendu.me.uk> Chris Fields wrote: > > On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote: >> If its going to be difficult and a hassle, for such an unnecessary >> thing I'm not sure its worth it. There are more pressing things to be >> done for Bioperl. >> >> If I can just run perltidy on the entire package and commit, I'd do >> it. If that's not appropriate, I won't. > > The choices aren't necessarily all or nothing. What about voluntary, > recommended use of a perltidy config file included with the > distribution, with additional 'caveats'? I'm happy with that idea. Why not come up with something and make it available for us to try out? Cheers, Sendu. From bix at sendu.me.uk Mon Jun 18 13:26:36 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 18 Jun 2007 14:26:36 +0100 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu> References: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu> Message-ID: <4676880C.9030009@sendu.me.uk> Chris Fields wrote: > If so it may be better to allow https, though I don't know how Chris > D. and others feel about it. If it makes no difference to me as an end-user, I won't mind. But I won't want to enter my password even once, at the beginning of a session. If that's not possible with https, then ssh should be an option as well. Unrelated, but it randomly just occurred to me: what happens to all the id lines at the top of modules? 
Eg: $Id: bl2seq.pm,v 1.28 2007/06/14 14:16:10 sendu Exp $ That's a cvs-specific thing, right? Do we delete them all? (Regardless, I wish we would, since they caused me no end of hassles during the 1.5.2 release, doing updates across branches.) > Did we make a decision as to the fate of cvs if we get svn up-and- > running? Keep it around (assuming svn commits would be carried over > to cvs and vice versa)? Or see what happens over time? Well, I don't think hard decisions are possible until we know how its going to work in practice. I tried setting up my own svn repository once, but didn't keep it and can't remember much about it. So, I suppose we'll play it by ear and decide things later. Is someone out there actively doing something leading toward a demonstration of how it will be? From cjfields at uiuc.edu Mon Jun 18 13:58:34 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 18 Jun 2007 08:58:34 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <467684FD.3080300@sendu.me.uk> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> <3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu> <467684FD.3080300@sendu.me.uk> Message-ID: On Jun 18, 2007, at 8:13 AM, Sendu Bala wrote: > Chris Fields wrote: >> >> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote: >>> If its going to be difficult and a hassle, for such an unnecessary >>> thing I'm not sure its worth it. There are more pressing things >>> to be >>> done for Bioperl. >>> >>> If I can just run perltidy on the entire package and commit, I'd do >>> it. If that's not appropriate, I won't. >> >> The choices aren't necessarily all or nothing. What about voluntary, >> recommended use of a perltidy config file included with the >> distribution, with additional 'caveats'? > > I'm happy with that idea. Why not come up with something and make it > available for us to try out? > > > Cheers, > Sendu. 
Will do. Maybe something that conforms to PBP; there's a PBP perltidy config on perlmonks, along with some emacs/vim related bits: http://www.perlmonks.org/?node_id=516501 chris From sdavis2 at mail.nih.gov Mon Jun 18 14:03:35 2007 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Mon, 18 Jun 2007 10:03:35 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <4676880C.9030009@sendu.me.uk> References: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu> <4676880C.9030009@sendu.me.uk> Message-ID: <467690B7.7090105@mail.nih.gov> Sendu Bala wrote: > Chris Fields wrote: >> If so it may be better to allow https, though I don't know how Chris >> D. and others feel about it. > > If it makes no difference to me as an end-user, I won't mind. But I > won't want to enter my password even once, at the beginning of a > session. If that's not possible with https, then ssh should be an option > as well. > > > Unrelated, but it randomly just occurred to me: what happens to all the > id lines at the top of modules? Eg: > > $Id: bl2seq.pm,v 1.28 2007/06/14 14:16:10 sendu Exp $ > > That's a cvs-specific thing, right? Do we delete them all? (Regardless, > I wish we would, since they caused me no end of hassles during the 1.5.2 > release, doing updates across branches.) See here: http://svnbook.red-bean.com/en/1.0/ch07s02.html Check out the section at the bottom having to do with svn:keywords. 
Sean

From akarger at CGR.Harvard.edu Mon Jun 18 14:10:57 2007
From: akarger at CGR.Harvard.edu (Amir Karger)
Date: Mon, 18 Jun 2007 10:10:57 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <46751EC7.8020609@sheffield.ac.uk>
References: <10517356.1182013758160.JavaMail.myubc2@brahms.my.ubc.ca> <46751EC7.8020609@sheffield.ac.uk>
Message-ID:

> Just to clarify, subversion is available as command line for windows:
> http://subversion.tigris.org/project_packages.html
>
> TortoiseSVN is another svn client with a GUI that integrates into the
> shell. I tried setting this up a while back to use ssh (via PUTTY), but
> I wasn't successful. This may have been due to me just starting out with
> svn or that it was harder to setup in an earlier version of TortoiseSVN.
>
> Does anyone have experience of setting up svn on Windows to use ssh? If
> the changeover takes place, I'm happy to write some howto's for setting
> up svn clients for Windows.

Here are some notes I wrote recently. I'm using this with command-line
svn, not TortoiseSVN. I would hope that it would work with Tortoise,
too, but I can't guarantee it.

1. Run PuTTYgen (installed with PuTTY, probably in Start
   menu->Programs->PuTTY) and follow directions to create a private key
   file like C:\someplace\private_key.ppk and a public key. At this
   point, you'll pick an ssh password, which is separate from your login
   password.

2. Get an account with the appropriate .ssh/authorized_keys file on the
   host machine. (This is not Windows-specific. By the way, if you
   change the lines of the authorized_keys file to start with, e.g.,

   command="svnserve -t -r /main/repos/dir",no-pty ssh-rsa AAAAB... comment

   then (a) you're more secure because users can't open a real shell on
   the computer, and (b) users don't need to type the repository
   directory in their svn co commands.)

3. Set your environment variables (My Computer->Properties, Advanced
   tab, click on Environment Variables; in the top half ("User variables
   for ..."), click "New" and put in the variable name and value).

   3a. Set the SVN_EDITOR environment variable to your favorite editor,
       such as vim or emacs, or a full path to some other editor. If
       it's not set, then either VISUAL or EDITOR must be set.

   3b. Set the SVN_SSH environment variable to run PuTTY's "plink"
       program, which is the Windows equivalent of command-line ssh. If
       you installed PuTTY in the default location, set it to
       "C:/Program Files/PuTTY/plink.exe".
       Note 1: use FORWARD slashes.
       Note 2: include the quotation marks in the environment variable.

4. When you want to start using svn, you'll need to run Pageant (Start
   menu->Programs->PuTTY), select "Add Key", browse to your private key
   file, and enter the ssh password you chose in step 1 (not your login
   password). Pageant will stay running until you quit it or log out, so
   you can have multiple svn checkins etc., and you only need to type in
   your password once.

5. Now just run command-line svn commands the same way you would on
   UNIX (modulo Windows' brain-dead shell).

-Amir Karger

From cjfields at uiuc.edu Mon Jun 18 14:24:00 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 09:24:00 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <4676880C.9030009@sendu.me.uk>
References: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu> <4676880C.9030009@sendu.me.uk>
Message-ID: <278C510F-8873-4F3D-A399-955979DD3DA5@uiuc.edu>

On Jun 18, 2007, at 8:26 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> If so it may be better to allow https, though I don't know how
>> Chris D. and others feel about it.
>
> If it makes no difference to me as an end-user, I won't mind. But I
> won't want to enter my password even once, at the beginning of a
> session. If that's not possible with https, then ssh should be an
> option as well.
Aaron pointed out in a related post that https access is the preferred option behind a corporate firewall (svn prompts for proxy user/pass, then caches it). Not sure how Jason/Hilmar/Chris D. feel about https or supporting both https+ssh. ... >> Did we make a decision as to the fate of cvs if we get svn up-and- >> running? Keep it around (assuming svn commits would be carried >> over to cvs and vice versa)? Or see what happens over time? > > Well, I don't think hard decisions are possible until we know how > its going to work in practice. I tried setting up my own svn > repository once, but didn't keep it and can't remember much about it. Agree; we'll need to work out specifics once we know how things work out using cvs2svn. I think the idea is to test using a smaller distribution (maybe network or db) and move up from there. > So, I suppose we'll play it by ear and decide things later. Is > someone out there actively doing something leading toward a > demonstration of how it will be? George Hartzell is going to test it out, I believe, and will post something when he can. chris From dmessina at wustl.edu Mon Jun 18 14:54:31 2007 From: dmessina at wustl.edu (David Messina) Date: Mon, 18 Jun 2007 09:54:31 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> <3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu> <467684FD.3080300@sendu.me.uk> Message-ID: <67E635BD-FC19-4046-949B-358B671299E6@wustl.edu> [Chris F] > Will do. 
> Maybe something that conforms to PBP; there's a PBP
> perltidy config on perlmonks, along with some emacs/vim related bits:
>
> http://www.perlmonks.org/?node_id=516501

FYI, perltidy now has a built-in -pbp flag:

[from perltidy-20070508]
> -pbp, --perl-best-practices
>     -pbp is an abbreviation for the parameters in the book Perl Best
>     Practices by Damian Conway:
>
>         -l=78 -i=4 -ci=4 -st -se -vt=2 -cti=0 -pt=1 -bt=1 -sbt=1 -bbt=1
>         -nsfs -nolq
>         -wbb="% + - * / x != == >= <= =~ !~ < > | & =
>               **= += *= &= <<= &&= -= /= |= >>= ||= //= .= %= ^= x="
>
>     Note that the -st and -se flags make perltidy act as a filter on
>     one file only. These can be overridden with -nst and -nse if
>     necessary.

[full docs here: http://search.cpan.org/~shancock/Perl-Tidy-20070508/bin/perltidy]

Dave

From dmessina at wustl.edu Mon Jun 18 15:04:10 2007
From: dmessina at wustl.edu (David Messina)
Date: Mon, 18 Jun 2007 10:04:10 -0500
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <467661F0.2060703@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk>
Message-ID:

Awesome, Sendu! Really glad you implemented this.

> Can anyone offer a
> way to systematically find at least the test scripts which access the
> internet, if not the specific tests within?

I think tests would be accessing the net indirectly through a BioPerl
module (which may also be using indirect access), so it'd be hard to
come up with a universal glob for that.

However:

% grep -Rl BIOPERLDEBUG bioperl-live/t | wc -l
108

% ls -1 bioperl-live/t | wc -l
248

Less than half of the test files use BIOPERLDEBUG, so that narrows down
the possibilities...
Dave From bix at sendu.me.uk Mon Jun 18 15:09:19 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 18 Jun 2007 16:09:19 +0100 Subject: [Bioperl-l] Network tests overhaul In-Reply-To: References: <467661F0.2060703@sendu.me.uk> Message-ID: <4676A01F.30205@sendu.me.uk> David Messina wrote: >> Can anyone offer a >> way to systematically find at least the test scripts which access the >> internet, if not the specific tests within? > > I think tests would be accessing the net indirectly through a BioPerl > module (which may also be using indirect access), so it'd be hard to > come up with a universal glob for that. > > However: > > % grep -Rl BIOPERLDEBUG bioperl-live/t | wc -l > 108 > > % ls -1 bioperl-live/t | wc -l > 248 > > Less than half of the test files use BIOPERLDEBUG, so that narrows down > the possibilities... Not necessarily. The problem is that there may be test scripts that have never even tried to skip network tests, and therefore don't use BIOPERLDEBUG. (Or that chose their own way to decide when to skip.) I was thinking along the lines of, does anyone know how to monitor accesses to the network card (or equivalent), getting information on which program (test script) requested the access? From cjfields at uiuc.edu Mon Jun 18 15:41:28 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 18 Jun 2007 10:41:28 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <67E635BD-FC19-4046-949B-358B671299E6@wustl.edu> References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk> <4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk> <467264C8.4020202@sendu.me.uk> <3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu> <467684FD.3080300@sendu.me.uk> <67E635BD-FC19-4046-949B-358B671299E6@wustl.edu> Message-ID: On Jun 18, 2007, at 9:54 AM, David Messina wrote: > [Chris F] >> Will do. 
Maybe something that conforms to PBP; there's a PBP >> perltidy config on perlmonks, along with some emacs/vim related bits: >> >> http://www.perlmonks.org/?node_id=516501 > > > FYI, perltidy now has a built-in -pbp flag: > > [from perltidy-20070508] >> -pbp, --perl-best-practices >> -pbp is an abbreviation for the parameters in the book Perl Best >> Practices by Damian Conway: >> >> -l=78 -i=4 -ci=4 -st -se -vt=2 -cti=0 -pt=1 -bt=1 -sbt=1 -bbt=1 >> -nsfs -nolq >> -wbb="% + - * / x != == >= <= =~ !~ < > | & = >> **= += *= &= <<= &&= -= /= |= >>= ||= //= .= %= ^= x=" >> Note that the -st and -se flags make perltidy act as a filter on >> one file only. These can be overridden with -nst and -nse if >> necessary. >> > [full docs here: http://search.cpan.org/~shancock/Perl-Tidy-20070508/ > bin/perltidy] > > > Dave Makes sense that would eventually be incorporated. If so there's no need to include a config (unless we want to sway away from PBP-style). We can just recommend everyone use that setting. chris From cjfields at uiuc.edu Mon Jun 18 16:06:26 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 18 Jun 2007 11:06:26 -0500 Subject: [Bioperl-l] Network tests overhaul In-Reply-To: <4676A01F.30205@sendu.me.uk> References: <467661F0.2060703@sendu.me.uk> <4676A01F.30205@sendu.me.uk> Message-ID: <082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu> On Jun 18, 2007, at 10:09 AM, Sendu Bala wrote: > David Messina wrote: >>> ... >> Less than half of the test files use BIOPERLDEBUG, so that narrows >> down >> the possibilities... > > Not necessarily. The problem is that there may be test scripts that > have > never even tried to skip network tests, and therefore don't use > BIOPERLDEBUG. (Or that chose their own way to decide when to skip.) > > I was thinking along the lines of, does anyone know how to monitor > accesses to the network card (or equivalent), getting information on > which program (test script) requested the access? 
EUtilities.t uses network tests predominantly. I'll switch over when I
commit everything from the overhaul.

Couldn't you enable BIOPERLDEBUG, disable network access, then iterate
through tests checking for those which fail or skip? I think
Test::Harness has a way to do this, using execute_tests().

chris

From bix at sendu.me.uk Mon Jun 18 16:34:38 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 18 Jun 2007 17:34:38 +0100
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu>
References: <467661F0.2060703@sendu.me.uk> <4676A01F.30205@sendu.me.uk> <082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu>
Message-ID: <4676B41E.3050706@sendu.me.uk>

Chris Fields wrote:
> Couldn't you enable BIOPERLDEBUG, disable network access, then iterate
> through tests checking for those which fail or skip?

Yes, good idea, though my dev machine is also my email/webserver so I'd
rather come up with an alternate solution than one involving 'disable
network access'.

Still, that's what I'll probably end up doing. Cheers!


Oh, Chris, Spiros, how goes the Test::More conversion? I might want to
wait for you to finish, or join in? If you're not going to have time to
do any more in the next few weeks, can you please update
http://www.bioperl.org/wiki/TestMoreProgress removing your name (or in
the opposite case, add your name in)? It's not quite clear to me which
tests are assigned to whom. Can someone clarify what the markings mean?

Cheers,
Sendu.
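
[Editorial sketch, not part of the original message.] The approach quoted above (set BIOPERLDEBUG, make the network unreachable, see which scripts fail) could be approximated along these lines. It is only a sketch: failing_tests is a hypothetical helper, not a BioPerl or Test::Harness function; it uses plain system() rather than execute_tests(); the -I. flag assumes the script is run from the top of a bioperl-live checkout; and it catches only scripts that exit non-zero, so tests that quietly skip are not detected.

```perl
use strict;
use warnings;

# Hypothetical helper: run each test script with BIOPERLDEBUG set and
# return the ones that exit with a non-zero status.
sub failing_tests {
    my @files = @_;
    local $ENV{BIOPERLDEBUG} = 1;    # ask scripts to run their network tests
    my @failed;
    for my $file (@files) {
        # $^X is the perl running this script. '-I.' is an assumption:
        # it lets the scripts find Bio:: modules in the current checkout.
        # A non-zero status means a test failed or the script died.
        push @failed, $file if system($^X, '-I.', $file) != 0;
    }
    return @failed;
}

# Meant to be run from the top of the checkout while the network is
# unreachable; prints nothing if no t/*.t files are found.
print "$_\n" for failing_tests(glob 't/*.t');
```

Scripts reported this way would be candidates for the new network flag; each still needs a manual look, since a failure can have other causes than the missing network.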
From cjfields at uiuc.edu Mon Jun 18 16:43:31 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 18 Jun 2007 11:43:31 -0500 Subject: [Bioperl-l] Network tests overhaul In-Reply-To: <4676B41E.3050706@sendu.me.uk> References: <467661F0.2060703@sendu.me.uk> <4676A01F.30205@sendu.me.uk> <082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu> <4676B41E.3050706@sendu.me.uk> Message-ID: <4D4061A2-E937-4F97-85C3-8937B597C64F@uiuc.edu> On Jun 18, 2007, at 11:34 AM, Sendu Bala wrote: > Chris Fields wrote: >> Couldn't you enable BIOPERLDEBUG, disable network access, then >> iterate through tests checking for those which fail or skip? > > Yes, good idea, though my dev machine is also my email/webserver so > I'd rather come up with an alternate solution than one involving > 'disable network access'. > > Still, that's what I'll probably end up doing. Cheers! > > > Oh, Chris, Spiros, how goes the Test::More conversion? I might want > to wait for you to finish, or join in? If you're not going to have > time to do any more in the next few weeks, can you please update > http://www.bioperl.org/wiki/TestMoreProgress removing your name (or > in the opposite case, add your name in)? Its not quite clear to me > which tests are assigned to whom. Can someone clarify what the > markings mean? > > Cheers, > Sendu. Not sure how far along spiros is; I handed it over after I finished up to the 'Q' tests. In general the ones marked out have been converted over, ones with names next to them have been claimed. If you need help I'll prob. start back up again to finish them off; we just need to divy them up. chris From george.heller at yahoo.com Mon Jun 18 17:07:59 2007 From: george.heller at yahoo.com (George Heller) Date: Mon, 18 Jun 2007 10:07:59 -0700 (PDT) Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <12C87737-B6D5-46E5-BCF4-5388A949009B@gmx.net> Message-ID: <218165.62089.qm@web56505.mail.re3.yahoo.com> What exactly is the "node n" in the query below. 
When I issue this query, it says, relation "node" does not exist. I tried to use the get_all_Descendents method, but it looks like it calls the method each_Descendent in order to recurse. This method is not implemented in Bio::DB::Taxonomy. It just has a single line, shift->throw_not_implemented(); Thanks. George.
Hilmar Lapp wrote:
> I'm a bit confused - it sounds like you have set up a local BioSQL database and loaded the NCBI taxonomy into the database. You can now use simple SQL to retrieve all descendants of a node in the tree given its NCBI taxonID such as
>
> SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n
> WHERE n.ncbi_taxon_id = :taxonID
> AND tn.left_value > n.left_value
> AND tn.right_value < n.right_value
> AND tn.taxon_id = tnm.taxon_id
> AND tn.name_class = 'scientific_name'
>
> BioPerl doesn't have a Taxonomy::biosql module yet (though this would seem like a worthwhile thing to add), so you can't use the Bio::DB::Taxonomy interface to do this against a BioSQL instance.
> However, BioPerl does have support for the flat-file download of the NCBI taxonomy database and indexes it, so you can simply use Taxonomy::{get_taxon,get_all_Descendents} using the flatfile download to achieve what you wanted to do in less than 5 lines of Perl.
> Although the recursive implementation of Taxonomy::get_all_Descendents() won't be lightning fast, it may still be perfectly fine for your application - are you sure it is not?
> -hilmar
From jason at bioperl.org Mon Jun 18 17:53:28 2007 From: jason at bioperl.org (Jason Stajich) Date: Mon, 18 Jun 2007 10:53:28 -0700 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <218165.62089.qm@web56505.mail.re3.yahoo.com> References: <218165.62089.qm@web56505.mail.re3.yahoo.com> Message-ID: <6F7C6223-2DD7-46B4-A637-EC6AFFC6BDBC@bioperl.org>
It is implemented in the implementing class; Bio::DB::Taxonomy is just the base class. For example, see the flatfile implementation, Bio::DB::Taxonomy::flatfile. See scripts/taxa/local_taxonomydb_query.PLS for an example that uses it; the nodes and names files come from the NCBI taxonomy database. Here is an un-debugged copy+paste for your question that *should* work.

    use Bio::DB::Taxonomy;

    # directory where the flatfile index will be written
    my $idx_dir = '/tmp';
    # nodes.dmp and names.dmp come from the NCBI taxonomy dump
    my ($nodesfile, $namesfile) = ('nodes.dmp', 'names.dmp');
    my $db = Bio::DB::Taxonomy->new(-source    => 'flatfile',
                                    -nodesfile => $nodesfile,
                                    -namesfile => $namesfile,
                                    -directory => $idx_dir);
    my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
    # keep only the leaves (extant species) among all descendants
    my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

-jason
On Jun 18, 2007, at 10:07 AM, George Heller wrote:
> What exactly is the "node n" in the query below. When I issue this
> query, it says,
>
> relation "node" does not exist.
>
> I tried to use the get_all_Descendents method but it looks like
> in order to do a recursive call it calls the method
> each_Descendent. This method is not implemented in
> Bio::DB::Taxonomy. It just has a single line,
>
> shift->throw_not_implemented();
>
> Thanks.
> George.
-- Jason Stajich jason at bioperl.org http://jason.open-bio.org/
From hlapp at gmx.net Mon Jun 18 22:10:00 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 18 Jun 2007 18:10:00 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <278C510F-8873-4F3D-A399-955979DD3DA5@uiuc.edu> References: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net> <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu> <4676880C.9030009@sendu.me.uk> <278C510F-8873-4F3D-A399-955979DD3DA5@uiuc.edu> Message-ID: <989DBD68-896E-4FB9-9413-4A1060E88ABD@gmx.net>
https is working fine for me for sf.net repositories, and I only have to enter the password upon first commit (since checkout doesn't even need a password). -hilmar On Jun 18, 2007, at 10:24 AM, Chris Fields wrote: > Not sure how Jason/Hilmar/Chris D.
feel about https or supporting > both https+ssh -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : ===========================================================
From george.heller at yahoo.com Mon Jun 18 22:18:21 2007 From: george.heller at yahoo.com (George Heller) Date: Mon, 18 Jun 2007 15:18:21 -0700 (PDT) Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <6F7C6223-2DD7-46B4-A637-EC6AFFC6BDBC@bioperl.org> Message-ID: <904670.24974.qm@web56513.mail.re3.yahoo.com>
I tried running the script below and I seem to be getting the following error:

    Weak references are not implemented in the version of perl at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
    BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76.
    Compilation failed in require at my.pl line 7.
    BEGIN failed--compilation aborted at my.pl line 7.

My script looks something like,

    #!/usr/bin/perl
    use strict;
    #use warnings;
    use DBI;
    use Bio::Tree::Node;
    use Bio::DB::Taxonomy;
    use Bio::DB::Taxonomy::flatfile;
    my $idx_dir = '/tmp';

    my ($nodesfile, $namesfile) = ('nodes.dmp', 'names.dmp');
    my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
                                   -nodesfile => $nodesfile,
                                   -namesfile => $namesfile,
                                   -directory => $idx_dir);
    my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
    my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

    foreach my $field (@extant_children) {
        print "$field";
        print "|";
        print "\n";
    }

And I am running the script using the command,

    perl myscript.pl -v --names names.dmp --nodes nodes.dmp

and I have the nodes.dmp and names.dmp files in the current directory. Thanks, George
From hlapp at gmx.net Mon Jun 18 22:27:19 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 18 Jun 2007 18:27:19 -0400 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <218165.62089.qm@web56505.mail.re3.yahoo.com> References: <218165.62089.qm@web56505.mail.re3.yahoo.com> Message-ID:
On Jun 18, 2007, at 1:07 PM, George Heller wrote: > What exactly is the "node n" in the query below. When I issue this > query, it says, Sorry, replace "node" with "taxon". Jason answered the rest.
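For readers following the thread, here is a rough, self-contained Perl sketch of why the left_value/right_value comparison in the SQL discussed in this thread selects every descendant of a node. It is only an illustration under stated assumptions: the rows are toy data (not real NCBI taxon ids), the helper name descendants_of is made up for this example and is not part of BioPerl or BioSQL, and the numbering assumes a standard nested-set encoding in which a descendant's interval lies strictly inside its ancestor's.

```perl
use strict;
use warnings;

# Toy nested-set table (hypothetical data, not real NCBI taxon ids).
# Tree shape: 1 -> (2 -> (4, 5), 3)
my @taxon = (
    { ncbi_taxon_id => 1, left_value => 1, right_value => 10 },
    { ncbi_taxon_id => 2, left_value => 2, right_value => 7  },
    { ncbi_taxon_id => 4, left_value => 3, right_value => 4  },
    { ncbi_taxon_id => 5, left_value => 5, right_value => 6  },
    { ncbi_taxon_id => 3, left_value => 8, right_value => 9  },
);

# Hypothetical helper: applies the same predicate as the SQL WHERE clause,
# i.e. a row is a descendant if its interval is strictly inside the
# interval of the node identified by $id.
sub descendants_of {
    my ($id, @rows) = @_;
    my ($n) = grep { $_->{ncbi_taxon_id} == $id } @rows;
    return () unless $n;    # unknown id: no descendants
    return sort { $a <=> $b }
           map  { $_->{ncbi_taxon_id} }
           grep { $_->{left_value}  > $n->{left_value}
               && $_->{right_value} < $n->{right_value} } @rows;
}

print join(',', descendants_of(2, @taxon)), "\n";   # prints 4,5
print join(',', descendants_of(1, @taxon)), "\n";   # prints 2,3,4,5
```

The point of the encoding is that no recursion is needed: one range comparison per row finds children, grandchildren and so on in a single pass, which is what the corrected query does inside the database.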
-- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : ===========================================================
From cjfields at uiuc.edu Mon Jun 18 22:33:40 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 18 Jun 2007 17:33:40 -0500 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <904670.24974.qm@web56513.mail.re3.yahoo.com> References: <904670.24974.qm@web56513.mail.re3.yahoo.com> Message-ID: <707D8821-68CB-40AE-A190-AD0F8F2C3915@uiuc.edu>
As the error implies, your local version of perl doesn't seem to support weak references, which means it doesn't have Scalar::Util (which was added to core after perl 5.6.1, I think). Try installing Scalar::Util to see what happens. chris
On Jun 18, 2007, at 5:18 PM, George Heller wrote:
> I tried running the below mentioned script and I seem to be getting
> the following error:
>
> Weak references are not implemented in the version of perl at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
> BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76.
> Compilation failed in require at my.pl line 7.
> BEGIN failed--compilation aborted at my.pl line 7.
Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign
From hlapp at gmx.net Mon Jun 18 22:50:38 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 18 Jun 2007 18:50:38 -0400 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <707D8821-68CB-40AE-A190-AD0F8F2C3915@uiuc.edu> References: <904670.24974.qm@web56513.mail.re3.yahoo.com> <707D8821-68CB-40AE-A190-AD0F8F2C3915@uiuc.edu> Message-ID:
The perl version appears to be 5.8.5 though, so something strange appears to be going on too. George, can you please post the output of $ /usr/bin/perl -V -hilmar
On Jun 18, 2007, at 6:33 PM, Chris Fields wrote: > As the error implies your local version of perl doesn't seem to support > weak references, which means it doesn't have Scalar::Util (which was > added to core after perl 5.6.1, I think). Try installing > Scalar::Util to see what happens.
-- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : ===========================================================
From george.heller at yahoo.com Mon Jun 18 23:05:42 2007 From: george.heller at yahoo.com (George Heller) Date: Mon, 18 Jun 2007 16:05:42 -0700 (PDT) Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: Message-ID: <706979.34648.qm@web56509.mail.re3.yahoo.com>
This is the output of /usr/bin/perl -V Summary of my perl5 (revision 5 version 8 subversion 5) configuration: Platform: osname=linux, osvers=2.6.9-22.18.bz155725.elsmp, archname=i386-linux-thread-multi uname='linux hs20-bc1-4.build.redhat.com 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686 i686 i386 gnulinux ' config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost -Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc.
    -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
    optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm'
    ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-2)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
    perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
    libc=/lib/libc-2.3.4.so, so=so, useshrplib=true, libperl=libperl.so
    gnulibc_version='2.3.4'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE'
    cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'

Characteristics of this binary (from libperl):
  Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS USE_LARGE_FILES
                        PERL_IMPLICIT_CONTEXT
  Built under linux
  Compiled at Jul 24 2006 18:28:10
  @INC:
    /usr/lib/perl5/5.8.5/i386-linux-thread-multi
    /usr/lib/perl5/5.8.5
    /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.5
    /usr/lib/perl5/site_perl/5.8.4
    /usr/lib/perl5/site_perl/5.8.3
    /usr/lib/perl5/site_perl/5.8.2
    /usr/lib/perl5/site_perl/5.8.1
    /usr/lib/perl5/site_perl/5.8.0
    /usr/lib/perl5/site_perl
    /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.5
    /usr/lib/perl5/vendor_perl/5.8.4
    /usr/lib/perl5/vendor_perl/5.8.3
    /usr/lib/perl5/vendor_perl/5.8.2
    /usr/lib/perl5/vendor_perl/5.8.1
    /usr/lib/perl5/vendor_perl/5.8.0
    /usr/lib/perl5/vendor_perl

Thanks.
George

From jason at bioperl.org Mon Jun 18 23:22:08 2007
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 18 Jun 2007 16:22:08 -0700
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <706979.34648.qm@web56509.mail.re3.yahoo.com>
References: <706979.34648.qm@web56509.mail.re3.yahoo.com>
Message-ID: 

Try installing the latest Scalar::Util
-- 
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/

From george.heller at yahoo.com Tue Jun 19 00:04:00 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 17:04:00 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: 
Message-ID: <424035.72876.qm@web56507.mail.re3.yahoo.com>

OK, I installed the latest Scalar::Util and the script seems to be working. But I am confused about where exactly I need to look for the descendent taxon ids once the script is run. I did look in the /tmp/ directory, but I couldn't understand much.

Sorry to bother you; I really appreciate your patience.

Thanks.
George Jason Stajich wrote: Try installing the latest Scalar::Util On Jun 18, 2007, at 4:05 PM, George Heller wrote: This is the output of /usr/bin/perl -V Summary of my perl5 (revision 5 version 8 subversion 5) configuration: Platform: osname=linux, osvers=2.6.9-22.18.bz155725.elsmp, archname=i386-linux-thread-multi uname='linux hs20-bc1-4.build.redhat.com 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686 i686 i386 gnulinux ' config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost -Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0' hint=recommended, useposix=true, d_sigaction=define usethreads=define use5005threads=undef useithreads=define usemultiplicity=define useperlio=define d_sfio=undef uselargefiles=define usesocks=undef use64bitint=undef use64bitall=undef uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm', optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4', cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm' ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-2)', gccosandvers='' intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12 ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8 alignbytes=4, prototype=define Linker and Libraries: 
ld='gcc', ldflags =' -L/usr/local/lib' libpth=/usr/local/lib /lib /usr/lib libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc libc=/lib/libc-2.3.4.so, so=so, useshrplib=true, libperl=libperl.so gnulibc_version='2.3.4' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE' cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib' Characteristics of this binary (from libperl): Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS USE_LARGE_FILES PERL_IMPLICIT_CONTEXT Built under linux Compiled at Jul 24 2006 18:28:10 @INC: /usr/lib/perl5/5.8.5/i386-linux-thread-multi /usr/lib/perl5/5.8.5 /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.5 /usr/lib/perl5/site_perl/5.8.4 /usr/lib/perl5/site_perl/5.8.3 /usr/lib/perl5/site_perl/5.8.2 /usr/lib/perl5/site_perl/5.8.1 /usr/lib/perl5/site_perl/5.8.0 /usr/lib/perl5/site_perl /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.5 /usr/lib/perl5/vendor_perl/5.8.4 /usr/lib/perl5/vendor_perl/5.8.3 /usr/lib/perl5/vendor_perl/5.8.2 /usr/lib/perl5/vendor_perl/5.8.1 /usr/lib/perl5/vendor_perl/5.8.0 /usr/lib/perl5/vendor_perl Thanks. George . 
Hilmar Lapp wrote: The perl version appears to be 5.8.5 though, so something strange appears to be going on too. George, can you please post the output of $ /usr/bin/perl -V -hilmar On Jun 18, 2007, at 6:33 PM, Chris Fields wrote: As the error implies your local version of perl doesn't seem support weak references, which means it doesn't have Scalar::Utils (which was added to core after perl 5.6.1, I think). Try installing Scalar::Utils to see what happens. chris On Jun 18, 2007, at 5:18 PM, George Heller wrote: I tried running the below mentioned script and I seem to be getting the following error: Weak references are not implemented in the version of perl at / usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76 BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/ Bio/Tree/Node.pm line 76. Compilation failed in require at my.pl line 7. BEGIN failed--compilation aborted at my.pl line 7. My script looks something like, #!/usr/bin/perl use strict; #use warnings; use DBI; use Bio::Tree::Node; use Bio::DB::Taxonomy; use Bio::DB::Taxonomy::flatfile; my $idx_dir = '/tmp'; my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp'); my $db = new Bio::DB::Taxonomy(-source => 'flatfile', -nodesfile => $nodesfile, -namesfile => $namesfile, -directory => $idx_dir); my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); my @extant_children = grep { $_->is_Leaf } $node- get_all_Descendents; foreach $field (@extant_children) { print "$field"; print "|"; print "\n"; } And I am running the script using the command, perl myscript.pl -v --names names.dmp --nodes nodes.dmp and I have the nodes.dmp and names.dmp files in the current directory. Thanks, George Jason Stajich wrote: It is implemented in the implementing class - DB::Taxonomy is just the base class. For example see the flatfile implementation Bio::DB::Taxonomy::flatfile See the scripts/taxa/local_taxonomydb_query.PLS for example using it: nodes and names are from NCBI taxonomy database. 

Here is an un-debugged copy+paste for your question that *should* work.

use Bio::DB::Taxonomy;
my $idx_dir = '/tmp';

my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
                               -nodesfile => $nodefile,
                               -namesfile => $namesfile,
                               -directory => $idx_dir);
my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

-jason

On Jun 18, 2007, at 10:07 AM, George Heller wrote:

What exactly is the "node n" in the query below? When I issue this
query, it says,

relation "node" does not exist.

I tried to use the get_all_Descendents method, but it looks like in
order to do a recursive call it calls the method each_Descendent.
This method is not implemented in Bio::DB::Taxonomy. It just has a
single line,

shift->throw_not_implemented();

Thanks.
George.

Hilmar Lapp wrote:

I'm a bit confused - it sounds like you have set up a local BioSQL
database and loaded the NCBI taxonomy into the database. You can now
use simple SQL to retrieve all descendants of a node in the tree
given its NCBI taxonID, such as

SELECT tn.*, tnm.name
FROM taxon tn, taxon_name tnm, node n
WHERE n.ncbi_taxon_id = :taxonID
AND tn.left_value > n.left_value
AND tn.right_value < n.right_value
AND tn.taxon_id = tnm.taxon_id
AND tn.name_class = 'scientific_name'

BioPerl doesn't have a Taxonomy::biosql module yet (though this would
seem like a worthwhile thing to add), so you can't use the
Bio::DB::Taxonomy interface to do this against a BioSQL instance.

However, BioPerl does have support for the flat-file download of the
NCBI taxonomy database and indexes it, so you can simply use
Taxonomy::{get_taxon,get_all_Descendents} with the flatfile download
to achieve what you wanted to do in less than 5 lines of perl.

Although the recursive implementation of
Taxonomy::get_all_Descendents() won't be lightning fast, it may still
be perfectly fine for your application - are you sure it is not?

-hilmar

On Jun 18, 2007, at 12:21 AM, George Heller wrote:

Thanks. And how can I assign the $node in the code below, so that it
refers to a particular taxon id record? I want to retrieve all the
descendents from the taxonomy hierarchy, given a particular taxon id.

I have a local db setup, into which I have uploaded data using the
load_ncbi_taxonomy.pl script.

Thanks.
George

Jason Stajich wrote:

I assume you already figured out how to set up a local taxonomydb?

You just want the extant species/leaves of the tree

my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

-jason

On Jun 17, 2007, at 11:41 AM, George Heller wrote:

Hi all,

Can anyone point me to some example that uses the get_all_Descendents
method from Bio::DB::Taxonomy? I am a newbie at this, and I am not
quite sure how to implement it.

Thanks.
George

Sendu Bala wrote:

George Heller wrote:
> Hi all,
>
> I am looking at extracting the taxonomy hierarchy for some taxon ids.
> What I plan to do is, for a given taxon id, say 33090, I want to
> extract all taxon ids that are children of this species. I do not
> just want the immediate children, but the children's children and so
> on.
>
> Any ideas on the way I can go about doing this?

Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
some kind of looping structure. Most easily a recursing sub.

If you happen to code up something neat and efficient, why not share
it with us and we could add it to the Taxonomy module(s).

_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/
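
The "Weak references are not implemented" error discussed in this thread went away once a newer Scalar::Util was installed. A minimal sketch of a check one could run first, assuming only core Perl; the helper name weaken_works is made up for illustration and is not part of BioPerl or Scalar::Util:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Scalar::Util ();

# Returns 1 if this perl's Scalar::Util can really weaken a reference,
# 0 otherwise (for example when weaken() is missing or dies).
sub weaken_works {
    my $target = {};          # keeps the data alive
    my $ref    = $target;     # second reference, the one we weaken
    my $is_weak = eval {
        Scalar::Util::weaken($ref);
        Scalar::Util::isweak($ref);
    };
    return $is_weak ? 1 : 0;
}

print "Scalar::Util version: ", (Scalar::Util->VERSION || 'unknown'), "\n";
print "weaken() usable: ", (weaken_works() ? "yes" : "no"), "\n";
```

If this reports "no", upgrading Scalar::Util (as suggested above) is the likely fix; whether that is the whole story on a given system is something to verify rather than assume.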
From jason at bioperl.org Tue Jun 19 00:17:34 2007
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 18 Jun 2007 17:17:34 -0700
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <424035.72876.qm@web56507.mail.re3.yahoo.com>
References: <424035.72876.qm@web56507.mail.re3.yahoo.com>
Message-ID: <1FE460CB-F001-4EAE-9E83-FDF52AFFA5D0@bioperl.org>

All the children are in this array. You get to decide what you want
to do with them. In the following example I print the id, rank, and
scientific name to the screen. Because this is a taxonomy db query
you are getting back Bio::Taxonomy::Taxon objects, so read the
documentation for this module to see what you can do with the object.
I would also suggest spending a little time with the Getting Started
and HOWTO:Trees documentation on the website to get familiar with the
objects and nomenclature.

my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
for my $child ( @extant_children ) {
    print "id is ", $child->id, "\n"; # NCBI taxa id
    print "rank is ", $child->rank, "\n"; # e.g. species
    print "scientific name is ", $child->scientific_name, "\n"; # scientific name
}

On Jun 18, 2007, at 5:04 PM, George Heller wrote:

> Ok, I installed the latest of Scalar::Util and the script seems to
> be working. But I am confused where exactly I need to look for the
> descendent taxon ids once the script is run. I did look into the
> /tmp/ directory, but I couldn't understand much.
>
> Sorry to be bothering, really appreciate your patience.
>
> Thanks.
> George

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/

From george.heller at yahoo.com Tue Jun 19 00:29:31 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 17:29:31 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <1FE460CB-F001-4EAE-9E83-FDF52AFFA5D0@bioperl.org>
Message-ID: <369098.81077.qm@web56507.mail.re3.yahoo.com>

But the problem is that I don't really get any output on the screen.
In the /tmp directory I get 4 files, namely parents, nodes, id2names
and names2id, but I don't know what to make of them. This is what my
script looks like,

#!/usr/bin/perl
use strict;
#use warnings;
use DBI;
use Bio::Tree::Node;
use Bio::DB::Taxonomy;
use Bio::DB::Taxonomy::flatfile;
my $idx_dir = '/tmp';
my $nodefile;
my $namesfile;

my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
                               -nodesfile => $nodefile,
                               -namesfile => $namesfile,
                               -directory => $idx_dir);
my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

for my $child ( @extant_children ) {
    print "id is ", $child->id, "\n"; # NCBI taxa id
    print "rank is ", $child->rank, "\n"; # e.g. species
    print "scientific name is ", $child->scientific_name, "\n"; # scientific name
}

Thanks.
George

Jason Stajich wrote:

All the children are in this array. You get to decide what you want
to do with them. In the following example I print the id, rank, and
scientific name to the screen. Because this is a taxonomy db query
you are getting back Bio::Taxonomy::Taxon objects, so read the
documentation for this module to see what you can do with the object.
I would also suggest spending a little time with the Getting Started
and HOWTO:Trees documentation on the website to get familiar with the
objects and nomenclature.

my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
for my $child ( @extant_children ) {
    print "id is ", $child->id, "\n"; # NCBI taxa id
    print "rank is ", $child->rank, "\n"; # e.g. species
    print "scientific name is ", $child->scientific_name, "\n"; # scientific name
}

On Jun 18, 2007, at 5:04 PM, George Heller wrote:

Ok, I installed the latest of Scalar::Util and the script seems to be
working. But I am confused where exactly I need to look for the
descendent taxon ids once the script is run. I did look into the
/tmp/ directory, but I couldn't understand much.
Sorry to be bothering, really appreaciate your patience. Thanks. George Jason Stajich wrote: Try installing the latest Scalar::Util On Jun 18, 2007, at 4:05 PM, George Heller wrote: This is the output of /usr/bin/perl -V Summary of my perl5 (revision 5 version 8 subversion 5) configuration: Platform: osname=linux, osvers=2.6.9-22.18.bz155725.elsmp, archname=i386-linux-thread-multi uname='linux hs20-bc1-4.build.redhat.com 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686 i686 i386 gnulinux ' config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost -Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0' hint=recommended, useposix=true, d_sigaction=define usethreads=define use5005threads=undef useithreads=define usemultiplicity=define useperlio=define d_sfio=undef uselargefiles=define usesocks=undef use64bitint=undef use64bitall=undef uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm', optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4', cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm' ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-2)', gccosandvers='' intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12 ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', 
lseeksize=8 alignbytes=4, prototype=define Linker and Libraries: ld='gcc', ldflags =' -L/usr/local/lib' libpth=/usr/local/lib /lib /usr/lib libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc libc=/lib/libc-2.3.4.so, so=so, useshrplib=true, libperl=libperl.so gnulibc_version='2.3.4' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE' cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib' Characteristics of this binary (from libperl): Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS USE_LARGE_FILES PERL_IMPLICIT_CONTEXT Built under linux Compiled at Jul 24 2006 18:28:10 @INC: /usr/lib/perl5/5.8.5/i386-linux-thread-multi /usr/lib/perl5/5.8.5 /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.5 /usr/lib/perl5/site_perl/5.8.4 /usr/lib/perl5/site_perl/5.8.3 /usr/lib/perl5/site_perl/5.8.2 /usr/lib/perl5/site_perl/5.8.1 /usr/lib/perl5/site_perl/5.8.0 /usr/lib/perl5/site_perl /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.5 /usr/lib/perl5/vendor_perl/5.8.4 /usr/lib/perl5/vendor_perl/5.8.3 /usr/lib/perl5/vendor_perl/5.8.2 /usr/lib/perl5/vendor_perl/5.8.1 /usr/lib/perl5/vendor_perl/5.8.0 /usr/lib/perl5/vendor_perl Thanks. George . 
Hilmar Lapp wrote: The perl version appears to be 5.8.5 though, so something strange appears to be going on too. George, can you please post the output of $ /usr/bin/perl -V -hilmar On Jun 18, 2007, at 6:33 PM, Chris Fields wrote: As the error implies your local version of perl doesn't seem support weak references, which means it doesn't have Scalar::Utils (which was added to core after perl 5.6.1, I think). Try installing Scalar::Utils to see what happens. chris On Jun 18, 2007, at 5:18 PM, George Heller wrote: I tried running the below mentioned script and I seem to be getting the following error: Weak references are not implemented in the version of perl at / usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76 BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/ Bio/Tree/Node.pm line 76. Compilation failed in require at my.pl line 7. BEGIN failed--compilation aborted at my.pl line 7. My script looks something like, #!/usr/bin/perl use strict; #use warnings; use DBI; use Bio::Tree::Node; use Bio::DB::Taxonomy; use Bio::DB::Taxonomy::flatfile; my $idx_dir = '/tmp'; my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp'); my $db = new Bio::DB::Taxonomy(-source => 'flatfile', -nodesfile => $nodesfile, -namesfile => $namesfile, -directory => $idx_dir); my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); my @extant_children = grep { $_->is_Leaf } $node- get_all_Descendents; foreach $field (@extant_children) { print "$field"; print "|"; print "\n"; } And I am running the script using the command, perl myscript.pl -v --names names.dmp --nodes nodes.dmp and I have the nodes.dmp and names.dmp files in the current directory. Thanks, George Jason Stajich wrote: It is implemented in the implementing class - DB::Taxonomy is just the base class. For example see the flatfile implementation Bio::DB::Taxonomy::flatfile See the scripts/taxa/local_taxonomydb_query.PLS for example using it: nodes and names are from NCBI taxonomy database. 
Here is an un-debugged copy+paste for your question that *should* work. use Bio::DB::Taxonomy my $idx_dir = '/tmp'; my ($nodefile,$namesfile) = ('nodes.dmp,'names.dmp'); my $db = new Bio::DB::Taxonomy(-source => 'flatfile', -nodesfile => $nodesfile, -namesfile => $namesfile, -directory => $idx_dir); my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); my @extant_children = grep { $_->is_Leaf } $node- get_all_Descendents; -jason On Jun 18, 2007, at 10:07 AM, George Heller wrote: What exactly is the "node n" in the query below. When I issue this query, it says, relation "node" does not exist. I tried to use the get_all_Descendents method but it looks like in order to do a recursive call it calls the method each_Descendent. This method is not implemented in Bio::DB::Taxonomy. It just has a single line, shift->throw_not_implemented(); Thanks. George. Hilmar Lapp wrote: I'm a bit confused - it sounds like you have set up a local BioSQL database and loaded the NCBI taxonomy into the database. You can now use simple SQL to retrieve all descendants of a node in the tree given its NCBI taxonID such as SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n WHERE n.ncbi_taxon_id = :taxonID AND tn.left_value > n. left_value AND tn.right_value < n.right_value AND tn.taxon_id = tnm.taxon_id AND tn.name_class = 'scientific_name' BioPerl doesn't have a Taxonomy::biosql module yet (though this would seem like a worthwhile thing to add), so you can't use the Bio::DB::Taxonomy interface to do this against a BioSQL instance. However, BioPerl does have support for the flat-file download of the NCBI taxonomy database and indexes it, so you can simply use Taxonomy::{get_taxon,get_all_Descendants} using the flatfile download to achieve what you wanted to do in a less than 5 lines of perl. Although the recursive implementation of Taxonomy::get_all_Descendants () won't be lightning fast, it may still be perfectly fine for your application - are you sure it is not? 
-hilmar On Jun 18, 2007, at 12:21 AM, George Heller wrote: Thanks. And how can I assign the $node here in the below code, such that I can reference it to a particular taxon id record? I want to retrieve all the descendents from the taxonomy hierarchy, given a particular taxon id. I have a local db setup, in which I have uploaded data using the load_ncbi_taxonomy.pl script. Thanks. George Jason Stajich wrote: I assume you already figured out how to setup a local taxonomydb? You just want the extant species/leaves of the tree my @extant_children = grep { $_->is_Leaf } $node- get_all_Descedents; -jason On Jun 17, 2007, at 11:41 AM, George Heller wrote: Hi all, Can anyone point me to some example that uses the get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at this, and I am not quite sure how to implement it. Thanks. George Sendu Bala wrote: George Heller wrote: Hi all, I am looking at extracting the taxonomy hierarchy for some taxon ids. What I plan to do is, for a given taxon id, say 33090, I want to extract all taxon ids that are children of this species. I do not just want the immediate children, but the children's children and so on. Any ideas on the way I can go about doing this? Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in some kind of looping structure. Most easily a recursing sub. If you happen to code up something neat and efficient, why not share it with us and we could add it to the Taxonomy module(s). --------------------------------- Shape Yahoo! in your own image. Join our Network Research Panel today! _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ --------------------------------- Need a vacation? Get great deals to amazing places on Yahoo! Travel. 
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign
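
The approach Sendu describes above (walk each node's immediate descendants inside a recursing sub) can be sketched without BioPerl installed. What follows is a minimal illustration on a made-up tree, not the Bio::DB::Taxonomy implementation: the %children hash, the ids in it, and the all_descendants and is_leaf subs are all assumptions invented for this example. With a real taxonomy, the child lookup would presumably be the each_Descendent call discussed in this thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toy taxonomy: parent id => list of child ids. The ids and the hash
# are invented for illustration; nothing here is read from nodes.dmp.
my %children = (
    1 => [2, 3],
    2 => [4, 5],
    3 => [],
    4 => [],
    5 => [6],
    6 => [],
);

# Hypothetical helper: return every descendant of $id (children,
# grandchildren, ...) by recursing depth-first through %children.
sub all_descendants {
    my ($id) = @_;
    my @found;
    for my $kid (@{ $children{$id} || [] }) {
        push @found, $kid, all_descendants($kid);
    }
    return @found;
}

# Hypothetical helper: a leaf is a node with no children
# (the toy analogue of the is_Leaf test used in the thread).
sub is_leaf {
    my ($id) = @_;
    return !@{ $children{$id} || [] };
}

my @descendants = all_descendants(1);
my @leaves      = grep { is_leaf($_) } @descendants;

print "descendants: @descendants\n";   # descendants: 2 4 5 6 3
print "leaves: @leaves\n";             # leaves: 4 6 3
```

Run as-is, it prints the descendants in depth-first order followed by the leaves; the final grep mirrors the grep { $_->is_Leaf } filter in Jason's snippet.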
From jason at bioperl.org Tue Jun 19 01:05:43 2007
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 18 Jun 2007 18:05:43 -0700
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <369098.81077.qm@web56507.mail.re3.yahoo.com>
References: <369098.81077.qm@web56507.mail.re3.yahoo.com>
Message-ID: 

The files are indexes because you are indexing a flatfile - this speeds up the lookup so the second time you run the script it doesn't have to index. You don't need to look at the files, they won't make sense to a human!

The reason it isn't printing anything is that someone didn't really write the implementation quite right. This code was overhauled by Sendu before the last release; I guess something didn't quite get connected.

I checked in code that has the Bio::Taxon delegating now to a DB handle for the each_Descendent call. You can either patch your code or just use the code listed here:
http://bioperl.org/wiki/Module:Bio::DB::Taxonomy

On Jun 18, 2007, at 5:29 PM, George Heller wrote:

> But the problem is that I don't really get any output on the
> screen. In the /tmp directory I get 4 files, namely parents, nodes,
> id2names and names2id, but I don't know what to make of them. This
> is what my script looks like,
>
> #!/usr/bin/perl
> use strict;
> #use warnings;
> use DBI;
> use Bio::Tree::Node;
> use Bio::DB::Taxonomy;
> use Bio::DB::Taxonomy::flatfile;
> my $idx_dir = '/tmp';
> my $nodefile;
> my $namesfile;
>
> my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
> my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
>                                -nodesfile => $nodefile,
>                                -namesfile => $namesfile,
>                                -directory => $idx_dir);
> my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
> for my $child ( @extant_children ) {
>     print "id is ", $child->id, "\n"; # NCBI taxa id
>     print "rank is ", $child->rank, "\n"; # e.g. species
>     print "scientific name is ", $child->scientific_name, "\n"; # scientific name
> }
>
> Thanks.
> George
>
> Jason Stajich wrote:
> All the children are in this array.
>
> You get to decide what you want to do with them. In the following
> example I print the id, rank, and scientific name out to the screen.
> Because this is a taxonomy db query you are getting back
> Bio::Taxonomy::Taxon objects so read the documentation for this
> module to see what you can do with the object.
> I would also suggest spending a little time with the Getting
> started and HOWTO:Trees documentation on the website to get
> familiar with the objects and nomenclature.
>
> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
> for my $child ( @extant_children ) {
>     print "id is ", $child->id, "\n"; # NCBI taxa id
>     print "rank is ", $child->rank, "\n"; # e.g. species
>     print "scientific name is ", $child->scientific_name, "\n"; # scientific name
> }
>
> On Jun 18, 2007, at 5:04 PM, George Heller wrote:
>
> Ok, I installed the latest Scalar::Util and the script seems
> to be working. But I am confused about where exactly I need to look for
> the descendant taxon ids once the script is run. I did look into
> the /tmp/ directory, but I couldn't understand much.
>
> Sorry to be bothering, really appreciate your patience.
>
> Thanks.
> George
>
> Jason Stajich wrote:
> Try installing the latest Scalar::Util
> On Jun 18, 2007, at 4:05 PM, George Heller wrote:
>
> This is the output of /usr/bin/perl -V
>
> Summary of my perl5 (revision 5 version 8 subversion 5) configuration:
>   Platform:
>     osname=linux, osvers=2.6.9-22.18.bz155725.elsmp, archname=i386-linux-thread-multi
>     uname='linux hs20-bc1-4.build.redhat.com 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686 i686 i386 gnulinux '
>     config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost -Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0'
>     hint=recommended, useposix=true, d_sigaction=define
>     usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
>     useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
>     use64bitint=undef use64bitall=undef uselongdouble=undef
>     usemymalloc=n, bincompat5005=undef
>   Compiler:
>     cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
>     optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4',
>     cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm'
>     ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-2)', gccosandvers=''
>     intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
>     d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
>     ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
>     alignbytes=4, prototype=define
>   Linker and Libraries:
>     ld='gcc', ldflags =' -L/usr/local/lib'
>     libpth=/usr/local/lib /lib /usr/lib
>     libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
>     perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
>     libc=/lib/libc-2.3.4.so, so=so, useshrplib=true, libperl=libperl.so
>     gnulibc_version='2.3.4'
>   Dynamic Linking:
>     dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE'
>     cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'
>
> Characteristics of this binary (from libperl):
>   Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS USE_LARGE_FILES PERL_IMPLICIT_CONTEXT
>   Built under linux
>   Compiled at Jul 24 2006 18:28:10
>   @INC:
>     /usr/lib/perl5/5.8.5/i386-linux-thread-multi
>     /usr/lib/perl5/5.8.5
>     /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi
>     /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi
>     /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi
>     /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi
>     /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi
>     /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
>     /usr/lib/perl5/site_perl/5.8.5
>     /usr/lib/perl5/site_perl/5.8.4
>     /usr/lib/perl5/site_perl/5.8.3
>     /usr/lib/perl5/site_perl/5.8.2
>     /usr/lib/perl5/site_perl/5.8.1
>     /usr/lib/perl5/site_perl/5.8.0
>     /usr/lib/perl5/site_perl
>     /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi
>     /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi
>     /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi
>     /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi
>     /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi
>     /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
>     /usr/lib/perl5/vendor_perl/5.8.5
>     /usr/lib/perl5/vendor_perl/5.8.4
>     /usr/lib/perl5/vendor_perl/5.8.3
>     /usr/lib/perl5/vendor_perl/5.8.2
>     /usr/lib/perl5/vendor_perl/5.8.1
>     /usr/lib/perl5/vendor_perl/5.8.0
>     /usr/lib/perl5/vendor_perl
>
> Thanks.
> George
>
> Hilmar Lapp wrote:
> The perl version appears to be 5.8.5 though, so something strange
> appears to be going on too.
>
> George, can you please post the output of
>
> $ /usr/bin/perl -V
>
> -hilmar
>
> On Jun 18, 2007, at 6:33 PM, Chris Fields wrote:
>
> As the error implies, your local version of perl doesn't seem to support
> weak references, which means it doesn't have Scalar::Util (which was
> added to core after perl 5.6.1, I think). Try installing
> Scalar::Util to see what happens.
>
> chris
>
> On Jun 18, 2007, at 5:18 PM, George Heller wrote:
>
> I tried running the below mentioned script and I seem to be getting
> the following error:
>
> Weak references are not implemented in the version of perl at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
> BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76.
> Compilation failed in require at my.pl line 7.
> BEGIN failed--compilation aborted at my.pl line 7.
>
> My script looks something like,
>
> #!/usr/bin/perl
> use strict;
> #use warnings;
> use DBI;
> use Bio::Tree::Node;
> use Bio::DB::Taxonomy;
> use Bio::DB::Taxonomy::flatfile;
> my $idx_dir = '/tmp';
>
> my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
> my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
>                                -nodesfile => $nodefile,
>                                -namesfile => $namesfile,
>                                -directory => $idx_dir);
> my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
> foreach my $field (@extant_children) {
>     print "$field";
>     print "|";
>     print "\n";
> }
>
> And I am running the script using the command,
>
> perl myscript.pl -v --names names.dmp --nodes nodes.dmp
>
> and I have the nodes.dmp and names.dmp files in the current
> directory.
>
> Thanks,
> George
>
> Jason Stajich wrote:
> It is implemented in the implementing class - DB::Taxonomy is
> just the base class. For example see the flatfile implementation
> Bio::DB::Taxonomy::flatfile
>
> See the scripts/taxa/local_taxonomydb_query.PLS for an example using
> it: nodes and names are from the NCBI taxonomy database.
>
> [...]

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/

From torsten.seemann at infotech.monash.edu.au Tue Jun 19 01:21:04 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Tue, 19 Jun 2007 11:21:04 +1000
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <4676A01F.30205@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk> <4676A01F.30205@sendu.me.uk>
Message-ID: 

Sendu,

> >> Can anyone offer a
> >> way to systematically find at least the test scripts which access the
> >> internet, if not the specific tests within?

Perhaps you could use 'strace' to list network system calls for each test script, and grep out AF_INET connections?
% strace -e trace=network command_to_test 2>&1 | grep AF_INET

I'm not an strace expert but it might do what you need.

--
--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University
--Tel +61 3 9905 9010

From george.heller at yahoo.com Tue Jun 19 01:16:10 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 18:16:10 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: 
Message-ID: <815364.33231.qm@web56512.mail.re3.yahoo.com>

Works perfectly. Thanks so much Jason, Hilmar, Chris. You've been a great help!

Thanks.
George

Jason Stajich wrote:

The files are indexes because you are indexing a flatfile - this speeds up the lookup so the second time you run the script it doesn't have to index. You don't need to look at the files, they won't make sense to a human!

The reason it isn't printing anything is that someone didn't really write the implementation quite right. This code was overhauled by Sendu before the last release; I guess something didn't quite get connected.

I checked in code that has the Bio::Taxon delegating now to a DB handle for the each_Descendent call. You can either patch your code or just use the code listed here:
http://bioperl.org/wiki/Module:Bio::DB::Taxonomy

On Jun 18, 2007, at 5:29 PM, George Heller wrote:

But the problem is that I don't really get any output on the screen. In the /tmp directory I get 4 files, namely parents, nodes, id2names and names2id, but I don't know what to make of them.
This is what my script looks like, #!/usr/bin/perl use strict; #use warnings; use DBI; use Bio::Tree::Node; use Bio::DB::Taxonomy; use Bio::DB::Taxonomy::flatfile; my $idx_dir = '/tmp'; my $nodefile; my $namesfile; my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp'); my $db = new Bio::DB::Taxonomy(-source => 'flatfile', -nodesfile => $nodefile, -namesfile => $namesfile, -directory => $idx_dir); my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents; for my $child ( @extant_children ) { print "id is ", $child->id, "\n"; # NCBI taxa id print "rank is ", $child->rank, "\n"; # e.g. species print "scientific name is ", $child->scientific_name, "\n"; # scientific name } Thanks. George Jason Stajich wrote: All the children are in this array. You get to decide what you want to do with them. In the following example I print the id, rank, and scientific name out to the screen. Because this is a taxonomy db query you are getting back Bio::Taxonomy::Taxon objects so read the documentation for this module to see what you can do with the object. I would also suggest spending a little time with the Getting started and HOWTO:Trees documentation on the website to get familiar with the objects and nomenclature. my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents; for my $child ( @extant_children ) { print "id is ", $child->id, "\n"; # NCBI taxa id print "rank is ", $child->rank, "\n"; # e.g. species print "scientific name is ", $child->scientific_name, "\n"; # scientific name } On Jun 18, 2007, at 5:04 PM, George Heller wrote: Ok, I installed the latest of Scalar::Util and the script seems to be working. But I am confused where exactly I need to look for the descendent taxon ids once the script is run. I did look into the /tmp/ directory, but I couldnt understand much. Sorry to be bothering, really appreaciate your patience. Thanks. 
George Jason Stajich wrote: Try installing the latest Scalar::Util On Jun 18, 2007, at 4:05 PM, George Heller wrote: This is the output of /usr/bin/perl -V Summary of my perl5 (revision 5 version 8 subversion 5) configuration: Platform: osname=linux, osvers=2.6.9-22.18.bz155725.elsmp, archname=i386-linux-thread-multi uname='linux hs20-bc1-4.build.redhat.com 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686 i686 i386 gnulinux ' config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost -Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0' hint=recommended, useposix=true, d_sigaction=define usethreads=define use5005threads=undef useithreads=define usemultiplicity=define useperlio=define d_sfio=undef uselargefiles=define usesocks=undef use64bitint=undef use64bitall=undef uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm', optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4', cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm' ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-2)', gccosandvers='' intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12 ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8 alignbytes=4, prototype=define Linker and Libraries: 
ld='gcc', ldflags =' -L/usr/local/lib' libpth=/usr/local/lib /lib /usr/lib libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc libc=/lib/libc-2.3.4.so, so=so, useshrplib=true, libperl=libperl.so gnulibc_version='2.3.4' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE' cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib' Characteristics of this binary (from libperl): Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS USE_LARGE_FILES PERL_IMPLICIT_CONTEXT Built under linux Compiled at Jul 24 2006 18:28:10 @INC: /usr/lib/perl5/5.8.5/i386-linux-thread-multi /usr/lib/perl5/5.8.5 /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.5 /usr/lib/perl5/site_perl/5.8.4 /usr/lib/perl5/site_perl/5.8.3 /usr/lib/perl5/site_perl/5.8.2 /usr/lib/perl5/site_perl/5.8.1 /usr/lib/perl5/site_perl/5.8.0 /usr/lib/perl5/site_perl /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.5 /usr/lib/perl5/vendor_perl/5.8.4 /usr/lib/perl5/vendor_perl/5.8.3 /usr/lib/perl5/vendor_perl/5.8.2 /usr/lib/perl5/vendor_perl/5.8.1 /usr/lib/perl5/vendor_perl/5.8.0 /usr/lib/perl5/vendor_perl Thanks. George . 
Hilmar Lapp wrote: The perl version appears to be 5.8.5 though, so something strange appears to be going on too. George, can you please post the output of $ /usr/bin/perl -V -hilmar On Jun 18, 2007, at 6:33 PM, Chris Fields wrote: As the error implies your local version of perl doesn't seem support weak references, which means it doesn't have Scalar::Utils (which was added to core after perl 5.6.1, I think). Try installing Scalar::Utils to see what happens. chris On Jun 18, 2007, at 5:18 PM, George Heller wrote: I tried running the below mentioned script and I seem to be getting the following error: Weak references are not implemented in the version of perl at / usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76 BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/ Bio/Tree/Node.pm line 76. Compilation failed in require at my.pl line 7. BEGIN failed--compilation aborted at my.pl line 7. My script looks something like, #!/usr/bin/perl use strict; #use warnings; use DBI; use Bio::Tree::Node; use Bio::DB::Taxonomy; use Bio::DB::Taxonomy::flatfile; my $idx_dir = '/tmp'; my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp'); my $db = new Bio::DB::Taxonomy(-source => 'flatfile', -nodesfile => $nodesfile, -namesfile => $namesfile, -directory => $idx_dir); my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); my @extant_children = grep { $_->is_Leaf } $node- get_all_Descendents; foreach $field (@extant_children) { print "$field"; print "|"; print "\n"; } And I am running the script using the command, perl myscript.pl -v --names names.dmp --nodes nodes.dmp and I have the nodes.dmp and names.dmp files in the current directory. Thanks, George Jason Stajich wrote: It is implemented in the implementing class - DB::Taxonomy is just the base class. For example see the flatfile implementation Bio::DB::Taxonomy::flatfile See the scripts/taxa/local_taxonomydb_query.PLS for example using it: nodes and names are from NCBI taxonomy database. 
Here is an un-debugged copy+paste for your question that *should* work. use Bio::DB::Taxonomy my $idx_dir = '/tmp'; my ($nodefile,$namesfile) = ('nodes.dmp,'names.dmp'); my $db = new Bio::DB::Taxonomy(-source => 'flatfile', -nodesfile => $nodesfile, -namesfile => $namesfile, -directory => $idx_dir); my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); my @extant_children = grep { $_->is_Leaf } $node- get_all_Descendents; -jason On Jun 18, 2007, at 10:07 AM, George Heller wrote: What exactly is the "node n" in the query below. When I issue this query, it says, relation "node" does not exist. I tried to use the get_all_Descendents method but it looks like in order to do a recursive call it calls the method each_Descendent. This method is not implemented in Bio::DB::Taxonomy. It just has a single line, shift->throw_not_implemented(); Thanks. George. Hilmar Lapp wrote: I'm a bit confused - it sounds like you have set up a local BioSQL database and loaded the NCBI taxonomy into the database. You can now use simple SQL to retrieve all descendants of a node in the tree given its NCBI taxonID such as SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n WHERE n.ncbi_taxon_id = :taxonID AND tn.left_value > n. left_value AND tn.right_value < n.right_value AND tn.taxon_id = tnm.taxon_id AND tn.name_class = 'scientific_name' BioPerl doesn't have a Taxonomy::biosql module yet (though this would seem like a worthwhile thing to add), so you can't use the Bio::DB::Taxonomy interface to do this against a BioSQL instance. However, BioPerl does have support for the flat-file download of the NCBI taxonomy database and indexes it, so you can simply use Taxonomy::{get_taxon,get_all_Descendants} using the flatfile download to achieve what you wanted to do in a less than 5 lines of perl. Although the recursive implementation of Taxonomy::get_all_Descendants () won't be lightning fast, it may still be perfectly fine for your application - are you sure it is not? 
-hilmar On Jun 18, 2007, at 12:21 AM, George Heller wrote: Thanks. And how can I assign the $node here in the below code, such that I can reference it to a particular taxon id record? I want to retrieve all the descendents from the taxonomy hierarchy, given a particular taxon id. I have a local db setup, in which I have uploaded data using the load_ncbi_taxonomy.pl script. Thanks. George Jason Stajich wrote: I assume you already figured out how to setup a local taxonomydb? You just want the extant species/leaves of the tree my @extant_children = grep { $_->is_Leaf } $node- get_all_Descedents; -jason On Jun 17, 2007, at 11:41 AM, George Heller wrote: Hi all, Can anyone point me to some example that uses the get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at this, and I am not quite sure how to implement it. Thanks. George Sendu Bala wrote: George Heller wrote: Hi all, I am looking at extracting the taxonomy hierarchy for some taxon ids. What I plan to do is, for a given taxon id, say 33090, I want to extract all taxon ids that are children of this species. I do not just want the immediate children, but the children's children and so on. Any ideas on the way I can go about doing this? Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in some kind of looping structure. Most easily a recursing sub. If you happen to code up something neat and efficient, why not share it with us and we could add it to the Taxonomy module(s). --------------------------------- Shape Yahoo! in your own image. Join our Network Research Panel today! _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ --------------------------------- Need a vacation? Get great deals to amazing places on Yahoo! Travel. 
From torsten.seemann at infotech.monash.edu.au Tue Jun 19 01:26:41 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Tue, 19 Jun 2007 11:26:41 +1000
Subject: [Bioperl-l] gff2xml
In-Reply-To:
References: <462784640706120904g25a6550dsc56a22af64ca98cd@mail.gmail.com>
Message-ID:

(Sean, please reply to the bioperl-l list rather than to me personally so everyone can read it. I'm reposting it here.)

> > I posted this on the gbrowse list earlier. I'm looking to convert gff
> > data files into xml. Does anyone know of a module written to do this
> > already?
>
> What DTD do you want the XML to conform to?
> eg. ChadoXML, TinySeq XML, TIGR XML ... ?

Hi Torsten, I'm collaborating with other groups and want web-service compatible functionality for various tools. Normally the analysis tools I'm using generate gff output. I'm going to have to wrap this output in XML with XSL stylesheet for end-users to view. Haven't done it before and don't know what DTD to use. The bp_seqconvert.pl doesn't accept gff format. I would imagine the DTD would be quite short as the gff files are very standard, I just don't have any experience with these DTD requirements.

--Sean O'Keeffe

From sac at bioperl.org Tue Jun 19 06:42:27 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Mon, 18 Jun 2007 23:42:27 -0700
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy)
Message-ID: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>

On 6/16/07, Jason Stajich wrote:
> [...]
> Just to say I already went through all the steps of running cvs2svn
> myself and had problems gathering back out the branches and all the
> tags when I tried it.
If you want to start with a smaller repository > like bioperl-network or bioperl-db as the initial cvs2svn conversion > script took quite a long time to run on bioperl-live. Might this been a good opportunity to investigate partitioning bioperl-live into sub-repositories? There has been talk in the past of defining a set of "core" modules separate from other functionally related groups of modules that would be viewed as optional extensions. The goal being to help manage growth and simplify releases. There are currently 892 modules under Bio/. In addition to simplifying the migration to SVN, it would also have other benefits. Say some new functionality or a slew of fixes were added to Bio::Graphics. We could turn around a new Bio::Graphics release quickly without having to work on getting various other parts up to snuff that aren't related to graphics (Biblio, DB, PopGen, Search etc.). Maintenance and releases of the various extensions would be more parallelizable, orchestrated by separate ring leaders. Over time, as a set of functionality matures, it would see fewer updates and there would be less of a need for users to download/install/test it. This could make bioperl easier to customize, extend, and grok in general. Long term, it should ease development and release cycles, but it will involve a bit of near term bullet-biting. We'd need to get clear on how to partition things, including modules, tests, docs, installation logic, etc. and we'd probably need new integration tests to verify that the subsets continue working together. What do folks think? Would this SVN-based, re-partitioned bioperl-live constitute a 2.0 release? Any volunteers to help assemble a roadmap and milestones? Should I go on dreaming? 
Cheers, Steve From bix at sendu.me.uk Tue Jun 19 07:01:05 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 19 Jun 2007 08:01:05 +0100 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: References: <369098.81077.qm@web56507.mail.re3.yahoo.com> Message-ID: <46777F31.7030402@sendu.me.uk> Jason Stajich wrote: > The reason it isn't printing anything is someone didn't really write > the implementation quite right. This code was overhauled by Sendu > before the last release I guess something didn't quite get connected. > > I checked in code that has the Bio::Taxon delegating now to a DB > handle for the each_Descendent call. > You can either patch your code or just use the code listed here: > http://bioperl.org/wiki/Module:Bio::DB::Taxonomy I've reverted that change. For some reason the docs for Bio::Taxon::each_Descendent aren't showing up on the website, but they state: --- Note that this method never asks the database for the descendents; it will only return objects you have manually set with add_Descendent(), or where this was done for you by making a Bio::Tree::Tree with this object as an argument to new(). To get the database descendents use $taxon->db_handle->each_Descendent($taxon). --- I also have a note in the Synopsis for the module: --- # Though be careful with each_Descendent - unless you add_Descendent() # yourself, you won't get an answer because unlike for ancestor(), # Bio::Taxon does not ask the database for the answer. You can ask the # database yourself using the same method: ($human) = $homo->db_handle->each_Descendent($homo); --- This is quite deliberate and is to prevent Bad Things from happening. (Can't exactly remember the reasoning now, but I know it was good.) 
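A minimal sketch of the database-handle approach described above: the recursion asks the handle for children at every level instead of calling each_Descendent() on the Taxon itself. ToyTaxonomyDB is a hypothetical stand-in so the sketch runs without BioPerl or the NCBI dump files, and all_descendants is an illustrative name; the only thing assumed of a real handle is the each_Descendent($taxon) call quoted from the Bio::Taxon docs.

```perl
use strict;
use warnings;

# Recursively collect every descendant of $taxon by asking the database
# handle (not the taxon object) for the children at each level.
sub all_descendants {
    my ($db, $taxon) = @_;
    my @found;
    for my $child ($db->each_Descendent($taxon)) {
        push @found, $child, all_descendants($db, $child);
    }
    return @found;
}

# Hypothetical in-memory stand-in for a taxonomy database handle.
# It maps a parent name to a list of child names.
package ToyTaxonomyDB;

sub new {
    my ($class, %children) = @_;
    return bless { children => {%children} }, $class;
}

sub each_Descendent {
    my ($self, $taxon) = @_;
    return @{ $self->{children}{$taxon} || [] };
}

package main;

my $db  = ToyTaxonomyDB->new(root => [qw(a b)], a => [qw(a1 a2)], b => []);
my @all = all_descendants($db, 'root');
print join(',', @all), "\n";
```

With a real Bio::Taxon the call would presumably be all_descendants($taxon->db_handle, $taxon); that form has not been run against a live database here.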
From bix at sendu.me.uk Tue Jun 19 07:41:57 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 19 Jun 2007 08:41:57 +0100 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> Message-ID: <467788C5.6070406@sendu.me.uk> Steve Chervitz wrote: > Might this been a good opportunity to investigate partitioning > bioperl-live into sub-repositories? There has been talk in the past of > defining a set of "core" modules separate from other functionally > related groups of modules that would be viewed as optional extensions. > The goal being to help manage growth and simplify releases. There are > currently 892 modules under Bio/. > > In addition to simplifying the migration to SVN, it would also have > other benefits. Say some new functionality or a slew of fixes were > added to Bio::Graphics. We could turn around a new Bio::Graphics > release quickly without having to work on getting various other parts > up to snuff that aren't related to graphics (Biblio, DB, PopGen, > Search etc.). Maintenance and releases of the various extensions would > be more parallelizable, orchestrated by separate ring leaders. > > Over time, as a set of functionality matures, it would see fewer > updates and there would be less of a need for users to > download/install/test it. This could make bioperl easier to customize, > extend, and grok in general. > > Long term, it should ease development and release cycles I actually take the opposite view. Breaking things up makes testing and releases more difficult. If one person acts as pumpkin for all the sub-parts, his work-load increases almost linearly with the number of sub-parts. If each sub-part gets its own pumpkin, where do all these pumpkins come from? 
It seems to me that frequently authors will write modules but inevitably their circumstance changes and they can no longer devote the time to look after them. Having a single pumpkin and 'forcing' him to make sure everything works (regardless of his personal interest in the module) seems more reliable than hoping there will be a person interested enough in each sub-part to handle its release. Since all sub-parts will at the least interact with the 'true' core set of Bioperl modules, they need to be tested and potentially re-released every time the true core is updated. And since some sub-parts will interact with other sub-parts, there will need to be coordinated joint-testing and release of multiple sub-parts. What happens when users report problems? We ask them what version they're running. Right now '1.5.2' means a specific thing, and its trivial for someone to confirm the same problem by installing 1.5.2. What happens when users have to list out all the versions of all the sub-parts they have? Who is going to consistently recreate a users hodge-podge of versions in order to confirm a bug? Won't the advice instead be: "update all versions to the latest and get back to us"? So, as I see it, all sub-parts would best be tested and released with a single new version number every time one sub-part is updated (significantly). In which case, why have sub-parts at all? Keeping things the way they are now means ease of release for the pumpkin and ease of installation for end-users (only one install command to issue to CPAN). Having 'true' sub-parts (each with its own pumpkin), in my fatalistic view, is just going to lead to some useful sub-parts being abandoned and never updated, even where updates may be desirable. Each and every Bio:: module could have been released separately by its respective author. 
As I see it, one of the main values of 'Bioperl' is that its one (reasonably) consistent collection of modules that lowers the barrier of entry for new Bioinformaticians, giving them extremely easy access to a whole host of functionality with a single install. From hlapp at gmx.net Tue Jun 19 12:47:02 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 19 Jun 2007 08:47:02 -0400 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <46777F31.7030402@sendu.me.uk> References: <369098.81077.qm@web56507.mail.re3.yahoo.com> <46777F31.7030402@sendu.me.uk> Message-ID: <5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net> So the real mistake was to write my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents; instead of my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); my @extant_children = grep { $_->is_Leaf } $db->get_all_Descendents ($node); I.e., the Bio::DB::Taxonomy object *will* (or is allowed to) ask the database? If this is correct, can we highlight this in the documentation? It's a small difference that everyone failed to spot. If it is not correct, then maybe we need to revisit the rationale for why a Bio::DB::Taxonomy::get_all_Descendents may not query the underlying database. Also, in my reading of Bio::Taxonomy::Taxon it won't use the database either for ancestor(). Which would be consistent with its other methods. I.e., the bottom line is don't use Node or Taxon objects for hierarchy queries that you expect to use an underlying database, use the Bio::DB::Taxonomy object instead. It makes sense, but is it true? -hilmar On Jun 19, 2007, at 3:01 AM, Sendu Bala wrote: > Jason Stajich wrote: >> The reason it isn't printing anything is someone didn't really write >> the implementation quite right. This code was overhauled by Sendu >> before the last release I guess something didn't quite get connected. 
>> >> I checked in code that has the Bio::Taxon delegating now to a DB >> handle for the each_Descendent call. >> You can either patch your code or just use the code listed here: >> http://bioperl.org/wiki/Module:Bio::DB::Taxonomy > > I've reverted that change. > > For some reason the docs for Bio::Taxon::each_Descendent aren't > showing > up on the website, but they state: > > --- > Note that this method never asks the database for the descendents; it > will only return objects you have manually set with add_Descendent > (), or > where this was done for you by making a Bio::Tree::Tree with this > object > as an argument to new(). > > To get the database descendents use > $taxon->db_handle->each_Descendent($taxon). > --- > > > I also have a note in the Synopsis for the module: > > --- > # Though be careful with each_Descendent - unless you add_Descendent() > # yourself, you won't get an answer because unlike for ancestor(), > # Bio::Taxon does not ask the database for the answer. You can ask the > # database yourself using the same method: > ($human) = $homo->db_handle->each_Descendent($homo); > --- > > > This is quite deliberate and is to prevent Bad Things from happening. > (Can't exactly remember the reasoning now, but I know it was good.) > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From rvos at interchange.ubc.ca Tue Jun 19 13:05:25 2007 From: rvos at interchange.ubc.ca (rvos) Date: Tue, 19 Jun 2007 06:05:25 -0700 (PDT) Subject: [Bioperl-l] SVN and ...Re: Perltidy Message-ID: <15433211.1182258325544.JavaMail.myubc2@brahms.my.ubc.ca> > Unrelated, but it randomly just occurred to me: what happens to all the > id lines at the top of modules? 
Eg: > > $Id: bl2seq.pm,v 1.28 2007/06/14 14:16:10 sendu Exp $ > > That's a cvs-specific thing, right? Do we delete them all? (Regardless, > I wish we would, since they caused me no end of hassles during the 1.5.2 > release, doing updates across branches.) If you run something like 'svn propset svn:keywords Id' on the file/folder/recursively, svn picks up on the $Id tag. The structure of the resulting string would be a little different, because svn revision numbers are simply auto-increasing integers (afaik) - so any regular expressions that cleverly want to include the revision number in $VERSION would need to be updated. From bix at sendu.me.uk Tue Jun 19 14:25:26 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 19 Jun 2007 15:25:26 +0100 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net> References: <369098.81077.qm@web56507.mail.re3.yahoo.com> <46777F31.7030402@sendu.me.uk> <5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net> Message-ID: <4677E756.6050200@sendu.me.uk> Hilmar Lapp wrote: > So the real mistake was to write > > my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); > my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents; > > instead of > > my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); > my @extant_children = grep { $_->is_Leaf } $db->get_all_Descendents > ($node); > > I.e., the Bio::DB::Taxonomy object *will* (or is allowed to) ask the > database? Yes, the database object methods use the database. I don't even think it makes sense to question that. What else would it do? > If this is correct, can we highlight this in the documentation? It's > a small difference that everyone failed to spot. The documentation for what? I've already clearly pointed out the gotcha in Bio::Taxon. > Also, in my reading of Bio::Taxonomy::Taxon it won't use the database > either for ancestor(). Which would be consistent with its other methods. Bio::Taxonomy::Taxon? Its deprecated. 
Bio::Taxon is what we're dealing with, and it /does/ use the db to get the ancestor, unless the ancestor is manually set (see below for explanation). > I.e., the bottom line is don't use Node or Taxon objects for > hierarchy queries that you expect to use an underlying database, use > the Bio::DB::Taxonomy object instead. It makes sense, but is it true? Almost. It happens to be true but ideally wouldn't be the case. The confusion and problems arise, I guess, because we have two ways to access/create hierarchies and both of them are built from the same building block (Bio::Taxon objects). On the one hand we have Bio::DB::Taxonomy and the other we have Bio::Tree::Tree. Tree objects are easy: you have a Taxon object created in memory for each and every node in the tree. Each Taxon knows its ancestor and descendants by storing references to the relevant Taxon objects in the tree. You 'navigate' through the tree by grabbing a Taxon inside it and asking the Taxon itself for its ancestor or descendant. This leaves us with the Taxon object having the methods ancestor() and each_Descendent(), which we'll expect to work in other circumstances. Bio::DB::Taxonomy returns single Taxon objects from the database on request. Now we still expect our ancestor() and each_Descendent() methods to work, but if things were set up like Bio::Tree::Tree we'd end up pulling the entire database into memory because we'd have to create all the Taxon objects that are ancestors and descendants, recursively, every time we request a single Taxon (which is wasteful in the case of Bio::DB::Taxonomy::flatfile and slow/not allowed in the case of Bio::DB::Taxonomy::entrez). The solution? We simply don't create the immediate ancestor or descendant Taxon objects of the requested Taxon, and instead implement the Taxon methods to ask the database to create them on demand, if they don't already exist. 
Well, that idea is fine (and necessary) for the ancestor method, but we run into problems with each_Descendent(). The problem arises when we create Bio::Tree::Tree objects from a Taxon we got from the database. Being able to do that is why Bio::Taxon is shared between them, as it is a very desirable thing to do: you can instantly create a lineage tree for a Taxon of interest and then use all the Bio::Tree::Tree methods on it. Unfortunately one of those methods is get_nodes() which is implemented using each_Descendent() and get_all_Descendents(). If each_Descendent() asked the database for the real answer, we'd end up pulling the entire database into the tree. So my implementation was to not ask the database and just warn people in the docs. Ideally it /would/ use the database, because that's what a user would expect. Can anyone see an alternate way around the problem? From hlapp at gmx.net Tue Jun 19 16:14:38 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 19 Jun 2007 12:14:38 -0400 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: <4677E756.6050200@sendu.me.uk> References: <369098.81077.qm@web56507.mail.re3.yahoo.com> <46777F31.7030402@sendu.me.uk> <5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net> <4677E756.6050200@sendu.me.uk> Message-ID: Sorry I was accidentally looking at an older branch. Reading through the Taxon module I get more confused though than would leave me at ease. Here's what I understand of your description of the problem: - We would like nodes returned from Bio::DB::Taxonomy to use the database for all hierarchical queries. - We would like nodes used in a Bio::Tree::Tree not to use the database for any hierarchical query. What I understand that we have is - Taxon node objects that have a db_handle set will use the database for ancestor(), unless it has been set manually (?), but not for each_Descendent(). - Taxon node objects that don't have a db_handle set won't use a database but will function normally otherwise. 
- This is needed to prevent Bio::Tree::Tree methods from pulling the entire tree into memory. If this is correct (I'm not sure it is), it sounds like we want to temporarily divorce taxonomy nodes from their database capabilities while they are being queried in a tree context? I'm still trying to understand - if I create a Bio::Tree::Tree from a single node, will the tree automatically contain all nodes along the lineage of ancestors up to the root? So, even if extracting this lineage involved querying a database it would be acceptable, but not for querying descendents? It sounds to me like what is needed is that nodes that get added to a tree need to be stripped of their database capabilities. This could be achieved by creating a wrapper class that delegates all non- hierarchical methods to the wrapped Taxon object, and overriding all hierarchical queries to not use a database. I'm not sure I fully understand yet though, but the inconsistent behavior will be sure to throw people off track. -hilmar On Jun 19, 2007, at 10:25 AM, Sendu Bala wrote: > Hilmar Lapp wrote: >> So the real mistake was to write >> my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); >> my @extant_children = grep { $_->is_Leaf } $node- >> >get_all_Descendents; >> instead of >> my $node = $db->get_Taxonomy_Node(-taxonid => '33090'); >> my @extant_children = grep { $_->is_Leaf } $db- >> >get_all_Descendents ($node); >> I.e., the Bio::DB::Taxonomy object *will* (or is allowed to) ask >> the database? > > Yes, the database object methods use the database. I don't even > think it makes sense to question that. What else would it do? > > >> If this is correct, can we highlight this in the documentation? >> It's a small difference that everyone failed to spot. > > The documentation for what? I've already clearly pointed out the > gotcha in Bio::Taxon. > > >> Also, in my reading of Bio::Taxonomy::Taxon it won't use the >> database either for ancestor(). 
Which would be consistent with >> its other methods. > > Bio::Taxonomy::Taxon? Its deprecated. Bio::Taxon is what we're > dealing with, and it /does/ use the db to get the ancestor, unless > the ancestor is manually set (see below for explanation). > > >> I.e., the bottom line is don't use Node or Taxon objects for >> hierarchy queries that you expect to use an underlying database, >> use the Bio::DB::Taxonomy object instead. It makes sense, but is >> it true? > > Almost. It happens to be true but ideally wouldn't be the case. The > confusion and problems arise, I guess, because we have two ways to > access/create hierarchies and both of them are built from the same > building block (Bio::Taxon objects). > > On the one hand we have Bio::DB::Taxonomy and the other we have > Bio::Tree::Tree. > > Tree objects are easy: you have a Taxon object created in memory > for each and every node in the tree. Each Taxon knows its ancestor > and descendants by storing references to the relevant Taxon objects > in the tree. You 'navigate' through the tree by grabbing a Taxon > inside it and asking the Taxon itself for its ancestor or descendant. > > This leaves us with the Taxon object having the methods ancestor() > and each_Descendent(), which we'll expect to work in other > circumstances. > > Bio::DB::Taxonomy returns single Taxon objects from the database on > request. Now we still expect our ancestor() and each_Descendent() > methods to work, but if things were set up like Bio::Tree::Tree > we'd end up pulling the entire database into memory because we'd > have to create all the Taxon objects that are ancestors and > descendants, recursively, every time we request a single Taxon > (which is wasteful in the case of Bio::DB::Taxonomy::flatfile and > slow/not allowed in the case of Bio::DB::Taxonomy::entrez). > > The solution? 
We simply don't create the immediate ancestor or > descendant Taxon objects of the requested Taxon, and instead > implement the Taxon methods to ask the database to create them on > demand, if they don't already exist. Well, that idea is fine (and > necessary) for the ancestor method, but we run into problems with > each_Descendent(). > > The problem arises when we create Bio::Tree::Tree objects from a > Taxon we got from the database. Being able to do that is why > Bio::Taxon is shared between them, as it is a very desirable thing > to do: you can instantly create a lineage tree for a Taxon of > interest and then use all the Bio::Tree::Tree methods on it. > Unfortunately one of those methods is get_nodes() which is > implemented using each_Descendent() and get_all_Descendents(). If > each_Descendent() asked the database for the real answer, we'd end > up pulling the entire database into the tree. > > So my implementation was to not ask the database and just warn > people in the docs. Ideally it /would/ use the database, because > that's what a user would expect. Can anyone see an alternate way > around the problem? -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cain.cshl at gmail.com Tue Jun 19 18:41:52 2007 From: cain.cshl at gmail.com (Scott Cain) Date: Tue, 19 Jun 2007 14:41:52 -0400 Subject: [Bioperl-l] [Gmod-gbrowse] is this a bp_genbank2gff3.pl bug? In-Reply-To: <18039.61086.829726.809888@gargle.gargle.HOWL> References: <18039.61086.829726.809888@gargle.gargle.HOWL> Message-ID: <1182278512.2592.42.camel@localhost.localdomain> Hi Alessandra, I cc'ed your message to the bioperl and sequence ontology mailing lists, since your question is relevant to both. 
Converting genbank files to GFF3 is excruciatingly difficult; I generally find that I can use the genbank2gff3 script to get me most of the way there, but then I need to do some manual fixing to make it 'right'. I am using bioperl-live, since there have been several fixes to the script since bioperl 1.5.2 was released, including the most recent fixes from me today (when I started working on this); I would suggest you use bioperl-live as well. I ran the script on chrY. Most (perhaps all) of the errors fit into a few categories: - CDS doesn't have a phase, where the GFF3 spec requires CDSes to have a phase. Since it can be a little bit of a hassle to calculate, I understand why it was left out, but I'll submit a bug report to have those calculated. If you are planning on loading the GFF file into Chado, you can use the --noCDS option to get exons instead of CDSes, which makes the problem go away (the validator has a bug here though--it reports the polypeptide derives_from mRNA as invalid, but it is correct; I'm reporting that directly to the author). Here's the bioperl bug report: http://bugzilla.open-bio.org/show_bug.cgi?id=2322 - "invalid type pair" is caused by the genbank file using feature types in a way that conflicts with the Sequence Ontology. For example, it has STS features that are part_of a gene, pseudogenic_region as part_of pseudogene. I don't know if there would be an easy way to catch this in the conversion script. You may need to fix these by hand. If the problems occur for features that you don't care about, you can use the --filter option to leave them out of the resulting GFF file (for example, adding '--filter STS' would leave all STS features out of the file). Also, if you don't plan on loading these into Chado (which does require SO-compliance) but instead plan on using a Bio::DB::SeqFeature database, these errors may not be a problem. 
- "invalid type" is caused by feature types that are not in SOFA (Sequence Ontology for Feature Annotation), though the terms probably are in SO. I thought at one point we discussed allowing any SO type to appear in the GFF3 type column, but that is not what the spec says now. I don't see this type of error as causing a problem for either Bio::DB::SeqFeature or Chado. Chado allows features to be typed with anything that is in SO and does not restrict to SOFA. Scott On Tue, 2007-06-19 at 16:56 +0200, Alessandra Bilardi wrote: > Hi all, > > I used bp_genbank2gff3.pl with CVS bioperl and it created gff3 about > human genbank file. I used validate_gff3 on line with human.gff and > it has id non-unique so the database gbrowse inserting has errors. > > I attach the error file about hs_ref_chrY.gbk and hs_ref_chr1.gbk that > I download at at ftp://ftp.ncbi.nih.gov/genomes/H_sapiens > Elements having id non-unique are: > - CDS or pseudo*exon without mRNA and parent > - STS with egual start and end > - tRNA with egual name > > If this is a bp_genbank2gff3.pl bug, can you rectify bp_genbank2gff3.pl? > If I'm mistaken, can you help me? > > Thanks very much for the help in advance, > > Alessandra. > > ------------------------------------------------------------------------- > This SF.net email is sponsored by DB2 Express > Download DB2 Express C - the FREE version of DB2 express and take > control of your XML. No limits. Just data. Click to get it now. > http://sourceforge.net/powerbar/db2/ > _______________________________________________ Gmod-gbrowse mailing list Gmod-gbrowse at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/gmod-gbrowse -- ------------------------------------------------------------------------ Scott Cain, Ph. D. cain at cshl.edu GMOD Coordinator (http://www.gmod.org/) 216-392-3087 Cold Spring Harbor Laboratory -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part URL: From sac at bioperl.org Tue Jun 19 18:54:39 2007 From: sac at bioperl.org (Steve Chervitz) Date: Tue, 19 Jun 2007 11:54:39 -0700 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <467788C5.6070406@sendu.me.uk> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> Message-ID: <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> Valid points, Sendu. I wonder if there might be a best-of-both-worlds approach here. I would not be advocating for a major slice and dice, but just identifying a few large, reasonably well established and encapsulated blocks of functionality that could be managed more independently and segregating them away from the rest. For example: DB, Graphics, Search+SearchIO, Tools. Once per year, we could have a "whole caboodle" release where the core and all sub parts are tested and released as a group, as we currently do. Then, updates to the sub parts can occur as-needed but without necessarily involving updates to other sub parts or the core. The onus would be on the pumpkin for the sub part release to make sure it continues to work with the last whole caboodle release. This would minimize the number of release clashes, since sub part updates would only be sanctioned relative to the last caboodle release, and it would ensure that the whole set continues to interoperate. Perhaps it would be worth experimenting with such an approach so we can judge it based on actual experience. We could identify one functional sub part and segregate it out, do a release cycle or two, along with a sub part release, and decide if this makes things easier or harder, for devs as well as users. We could always bring it back into the fold if it doesn't work out. 
My fear is that as bioperl continues to grow, the monolithic approach will become increasingly onerous for a single release pumpkin to manage, and harder to find someone who feels up to the task. It could also discourage new developers from diving into the codebase if it looks too deep. And they are our lifeblood. A more functionally segregated bioperl codebase could lower the activation energy needed to recruit release pumpkins and new devs, leading to more release iterations, fewer bugs, more features, and more sustainable growth. When I first discovered Bioperl in 1996, it had three modules. At ~900, I probably wouldn't have joined ranks as a developer (well, I probably would, but it would have taken a while to digest it and become a contributor). Steve On 6/19/07, Sendu Bala wrote: > Steve Chervitz wrote: > > Might this been a good opportunity to investigate partitioning > > bioperl-live into sub-repositories? There has been talk in the past of > > defining a set of "core" modules separate from other functionally > > related groups of modules that would be viewed as optional extensions. > > The goal being to help manage growth and simplify releases. There are > > currently 892 modules under Bio/. > > > > In addition to simplifying the migration to SVN, it would also have > > other benefits. Say some new functionality or a slew of fixes were > > added to Bio::Graphics. We could turn around a new Bio::Graphics > > release quickly without having to work on getting various other parts > > up to snuff that aren't related to graphics (Biblio, DB, PopGen, > > Search etc.). Maintenance and releases of the various extensions would > > be more parallelizable, orchestrated by separate ring leaders. > > > > Over time, as a set of functionality matures, it would see fewer > > updates and there would be less of a need for users to > > download/install/test it. This could make bioperl easier to customize, > > extend, and grok in general. 
> > > > Long term, it should ease development and release cycles > > I actually take the opposite view. Breaking things up makes testing and > releases more difficult. > > If one person acts as pumpkin for all the sub-parts, his work-load > increases almost linearly with the number of sub-parts. If each sub-part > gets its own pumpkin, where do all these pumpkins come from? It seems to > me that frequently authors will write modules but inevitably their > circumstance changes and they can no longer devote the time to look > after them. Having a single pumpkin and 'forcing' him to make sure > everything works (regardless of his personal interest in the module) > seems more reliable than hoping there will be a person interested enough > in each sub-part to handle its release. > > Since all sub-parts will at the least interact with the 'true' core set > of Bioperl modules, they need to be tested and potentially re-released > every time the true core is updated. And since some sub-parts will > interact with other sub-parts, there will need to be coordinated > joint-testing and release of multiple sub-parts. > > What happens when users report problems? We ask them what version > they're running. Right now '1.5.2' means a specific thing, and its > trivial for someone to confirm the same problem by installing 1.5.2. > What happens when users have to list out all the versions of all the > sub-parts they have? Who is going to consistently recreate a users > hodge-podge of versions in order to confirm a bug? Won't the advice > instead be: "update all versions to the latest and get back to us"? > > So, as I see it, all sub-parts would best be tested and released with a > single new version number every time one sub-part is updated > (significantly). In which case, why have sub-parts at all? Keeping > things the way they are now means ease of release for the pumpkin and > ease of installation for end-users (only one install command to issue to > CPAN). 
Having 'true' sub-parts (each with its own pumpkin), in my > fatalistic view, is just going to lead to some useful sub-parts being > abandoned and never updated, even where updates may be desirable. > > Each and every Bio:: module could have been released separately by its > respective author. As I see it, one of the main values of 'Bioperl' is > that its one (reasonably) consistent collection of modules that lowers > the barrier of entry for new Bioinformaticians, giving them extremely > easy access to a whole host of functionality with a single install. > From bix at sendu.me.uk Tue Jun 19 19:13:39 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 19 Jun 2007 20:13:39 +0100 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> Message-ID: <46782AE3.2090703@sendu.me.uk> Steve Chervitz wrote: > Valid points, Sendu. I wonder if there might be a best-of-both-worlds > approach here. [snip] You haven't convinced me, but I'd go along with the majority decision if best-of-both-worlds was picked. > DB, Graphics, Search+SearchIO, Tools. I will, however, say that DB interleaves into too many core modules. It should stay in core. Tools? Its hardly touched anyway, so I don't see the value of taking it out, what with Bio::Tools::Run already being its own package. Most Bioperl users probably get Bioperl just to do something Blast related, so all Blast stuff really ought to stay in core. Graphics is an obvious choice and I agree. Updated frequently, and has its own release needs. It also has some of the trickier dependencies, so would make installing core simpler. I can imagine plucking Search+SearchIO out, and its something that needs regular updating. Another good candidate. 
> Perhaps it would be worth experimenting with such an approach so we > can judge it based on actual experience. We could identify one > functional sub part and segregate it out, do a release cycle or two, > along with a sub part release, and decide if this makes things easier > or harder, for devs as well as users. Well, we already have the run package. Its a split-off subpart that gets updated. The only 'experiment' left to do is finding it its own pumpkin. From bix at sendu.me.uk Tue Jun 19 19:48:50 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 19 Jun 2007 20:48:50 +0100 Subject: [Bioperl-l] Taxonomy hierarchy extraction In-Reply-To: References: <369098.81077.qm@web56507.mail.re3.yahoo.com> <46777F31.7030402@sendu.me.uk> <5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net> <4677E756.6050200@sendu.me.uk> Message-ID: <46783322.30309@sendu.me.uk> Hilmar Lapp wrote: > Here's what I understand of your description of the problem: > > - We would like nodes returned from Bio::DB::Taxonomy to use the > database for all hierarchical queries. > > - We would like nodes used in a Bio::Tree::Tree not to use the > database for any hierarchical query. Correct. > What I understand that we have is > > - Taxon node objects that have a db_handle set will use the database > for ancestor(), unless it has been set manually (?), but not for > each_Descendent(). > > - Taxon node objects that don't have a db_handle set won't use a > database but will function normally otherwise. > > - This is needed to prevent Bio::Tree::Tree methods from pulling the > entire tree into memory. Correct. > If this is correct (I'm not sure it is), it sounds like we want to > temporarily divorce taxonomy nodes from their database capabilities > while they are being queried in a tree context? Yes. > I'm still trying to understand - if I create a Bio::Tree::Tree from a > single node, will the tree automatically contain all nodes along the > lineage of ancestors up to the root? 
So, even if extracting this > lineage involved querying a database it would be acceptable, but not > for querying descendents? Yes. Asking the database for all the ancestors up to root only pulls a couple of nodes into the tree and is exactly what the user would want to happen. But if nodes are allowed to get their descendants from the database, when we get the root node from the database, we'd get all the root's descendants, and then for each of those we'd get all /their/ descendants... that's when the whole db gets sucked in. > It sounds to me like what is needed is that nodes that get added to a > tree need to be stripped of their database capabilities. This could > be achieved by creating a wrapper class that delegates all non- > hierarchical methods to the wrapped Taxon object, and overriding all > hierarchical queries to not use a database. I'm not sure I fully > understand yet though, but the inconsistent behavior will be sure to > throw people off track. When we're making a tree from a db Taxon we need db access to find all the ancestors; we just don't want to get any descendants outside our initiating Taxon's direct lineage. 
use Test::More 'no_plan';
use Bio::DB::Taxonomy;
use Bio::Tree::Tree;

# Build a small in-memory taxonomy from two lineages
my @names = ('Eukaryota', 'Mammalia', 'Primates', 'Homo', 'Homo sapiens');
my @ranks = qw(superkingdom class order genus species);
my $db = Bio::DB::Taxonomy->new(-source => 'list',
                                -names  => \@names,
                                -ranks  => \@ranks);

@names = ('Eukaryota', 'Mammalia', 'Rodentia', 'Mus', 'Mus musculus');
$db->add_lineage(-names => \@names, -ranks => \@ranks);

my $homo = $db->get_taxon(-name => 'Homo');
isa_ok($homo, 'Bio::Taxon'); # PASS
is $homo->ancestor->scientific_name, 'Primates'; # PASS

my @descs = $homo->each_Descendent;
is scalar(@descs), 1; # FAIL, we wanted it to contain the 'Homo sapiens' node

my $lineage = Bio::Tree::Tree->new(-node => $homo);
is $lineage->get_root_node->scientific_name, 'Eukaryota'; # PASS

my @nodes = $lineage->get_nodes;
is scalar(@nodes), 4; # PASS: we didn't pull in Rodentia which would be 8

(on that last test I can't remember if the answer might actually be 5 because our lineage does contain 'Homo sapiens')

If anyone can figure out how to get all those to pass, please let me know.

From cjfields at uiuc.edu Tue Jun 19 21:15:00 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 19 Jun 2007 16:15:00 -0500
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy)
In-Reply-To: <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
Message-ID: <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>

On Jun 19, 2007, at 1:54 PM, Steve Chervitz wrote:

> Valid points, Sendu. I wonder if there might be a best-of-both-worlds
> approach here. I would not be advocating for a major slice and dice,
> but just identifying a few large, reasonably well established and
> encapsulated blocks of functionality that could be managed more
> independently and segregating them away from the rest. For example:
> DB, Graphics, Search+SearchIO, Tools.
There should also be a consensus between the core devs on this; I don't see it going very far w/o Lincoln, Jason, Hilmar, etc. voicing their opinions as it will directly impact projects which rely on core functionality (GBrowse/GMOD, bioperl-db, etc). I also agree with George that this should be postponed until after svn issues are taken care of. Stating that, I think this is a good idea in general, though we'll need to be careful which ones we segregate out as non-core. I agree with your choices; I would add in Bio::Restriction, Bio::Assembly, Bio::Structure, and a few more. As long as the distribution required installation of 'core' prior to test runs it shouldn't be too much of a problem. In order for this to work we would need to delineate what defines 'core' (how broad the definition should be), then identify those modules that don't fit and decide what to do with them. Would we want to split the others into separate packages or lump together as a bioperl-auxiliary (horrid name, but you get my point)? Too many could be a logistical nightmare, as Sendu has pointed out. > Once per year, we could have a "whole caboodle" release where the core > and all sub parts are tested and released as a group, as we currently > do. Then, updates to the sub parts can occur as-needed but without > necessarily involving updates to other sub parts or the core. Sounds fine by me. Actually, my thought was we could reimplement Bundle::BioPerl on CPAN (which Module::Build effectively obsoleted) to install all the necessary subpackages in order to emulate an old- style 'core' installation, or act as an 'install everything BioPerl- related' Bundle. Regular updates of the subpackages to CPAN should just require updating the Bundle (which would update only the relevant parts, at least I believe it would). > The onus would be on the pumpkin for the sub part release to make sure > it continues to work with the last whole caboodle release. 
This would > minimize the number of release clashes, since sub part updates would > only be sanctioned relative to the last caboodle release, and it would > ensure that the whole set continues to interoperate. > > Perhaps it would be worth experimenting with such an approach so we > can judge it based on actual experience. We could identify one > functional sub part and segregate it out, do a release cycle or two, > along with a sub part release, and decide if this makes things easier > or harder, for devs as well as users. We could always bring it back > into the fold if it doesn't work out. > > My fear is that as bioperl continues to grow, the monolithic approach > will become increasingly onerous for a single release pumpkin to > manage, and harder to find someone who feels up to the task. It could > also discourage new developers from diving into the codebase if it > looks too deep. And they are our lifeblood. Agreed! > A more functionally segregated bioperl codebase could lower the > activation energy needed to recruit release pumpkins and new devs, > leading to more release iterations, fewer bugs, more features, and > more sustainable growth. 'Activation energy.' Hmm. Spoken like a true biologist. > When I first discovered Bioperl in 1996, it had three modules. At > ~900, I probably wouldn't have joined ranks as a developer (well, I > probably would, but it would have taken a while to digest it and > become a contributor). > > Steve I pretty much agree, though this will require quite a bit more discussion. 
chris From hlapp at gmx.net Tue Jun 19 21:57:54 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 19 Jun 2007 17:57:54 -0400 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> Message-ID: <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> On Jun 19, 2007, at 5:15 PM, Chris Fields wrote: > There should also be a consensus between the core devs on this; I > don't see it going very far w/o Lincoln, Jason, Hilmar, etc. voicing > their opinions The problem I have increasingly had with BioPerl (aside from the fact that it's written in Perl ;) is the plethora of dependencies I need to install, not the number of modules. But every time I've been told that that's what Perl is all about, and I should shut up and install the bundle. Idiosyncratically I don't like bundles that clutter up my hard disk with stuff I'll never use, and in this sense if BioPerl is divided into 10 packages I will have to think about each one whether I need it, and do a separate CVS checkout - and regular update - of each one (though granted, I believe there are ways the multiple checkout and update thing can be taken care of). In reality, this may be a rapidly disappearing trait though of those who have grown up in a time when they proudly spent all their savings to buy that new computer because it had a 20MB hard disk, compared to the two 360k floppy drives the previous one had. So don't ask me, just don't make it too hard for the dinosaurs. > as it will directly impact projects which rely on core > functionality (GBrowse/GMOD, bioperl-db, etc). Well, I hope there are ways to limit that? > I also agree with George that this should be postponed until after > svn issues are taken care of. I agree entirely. 
Please don't throw this in the same bin or tie one to the other. The migration is neither easier nor faster nor better testable with a partitioned BioPerl. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Wed Jun 20 01:48:20 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 19 Jun 2007 20:48:20 -0500 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> Message-ID: On Jun 19, 2007, at 4:57 PM, Hilmar Lapp wrote: > On Jun 19, 2007, at 5:15 PM, Chris Fields wrote: > >> There should also be a consensus between the core devs on this; I >> don't see it going very far w/o Lincoln, Jason, Hilmar, etc. voicing >> their opinions > > The problem I have increasingly had with BioPerl (aside from the fact > that it's written in Perl ;) is the plethora of dependencies I need > to install, not the number of modules. > > But every time I've been told that that's what Perl is all about, and > I should shut up and install the bundle. Idiosyncratically I don't > like bundles that clutter up my hard disk with stuff I'll never use, > and in this sense if BioPerl is divided into 10 packages I will have > to think about each one whether I need it, and do a separate CVS > checkout - and regular update - of each one (though granted, I > believe there are ways the multiple checkout and update thing can be > taken care of). I agree; the fewer dependencies the better. 
We could divide it up into a small, focused core package with only a few dependencies, and 1-3 more containing the focused bits which require the most maintenance (Graphics, SearchIO/Tools, etc). I worry about having too many more. > In reality, this may be a rapidly disappearing trait though of those > who have grown up in a time when they proudly spent all their savings > to buy that new computer because it had a 20MB hard disk, compared to > the two 360k floppy drives the previous one had. > > So don't ask me, just don't make it too hard for the dinosaurs. There would need to be some way of getting an old-style full-blown core installation regardless of how many subdistros we would divy core up into. My thought for CPAN was having Bundle::BioPerl take over this but I'm not sure if it's still being used. Maybe there are other ways for svn/cvs. >> as it will directly impact projects which rely on core >> functionality (GBrowse/GMOD, bioperl-db, etc). > > Well, I hope there are ways to limit that? I believe so, yes, particularly for bioperl-db. I would think splitting off Bio::Graphics or Bio::DB* will have some effect on GBrowse/GFF. >> I also agree with George that this should be postponed until after >> svn issues are taken care of. > > I agree entirely. Please don't throw this in the same bin or tie one > to the other. The migration is neither easier nor faster nor better > testable with a partitioned BioPerl. > > -hilmar We def. have to complete transition to subversion first, then think about this some more. chris From n.haigh at sheffield.ac.uk Wed Jun 20 06:31:24 2007 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Wed, 20 Jun 2007 07:31:24 +0100 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> Message-ID: <4678C9BC.10206@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Chris Fields wrote: > On Jun 19, 2007, at 4:57 PM, Hilmar Lapp wrote: > >> On Jun 19, 2007, at 5:15 PM, Chris Fields wrote: >> >>> There should also be a consensus between the core devs on this; I >>> don't see it going very far w/o Lincoln, Jason, Hilmar, etc. voicing >>> their opinions >> The problem I have increasingly had with BioPerl (aside from the fact >> that it's written in Perl ;) is the plethora of dependencies I need >> to install, not the number of modules. >> >> But every time I've been told that that's what Perl is all about, and >> I should shut up and install the bundle. Idiosyncratically I don't >> like bundles that clutter up my hard disk with stuff I'll never use, >> and in this sense if BioPerl is divided into 10 packages I will have >> to think about each one whether I need it, and do a separate CVS >> checkout - and regular update - of each one (though granted, I >> believe there are ways the multiple checkout and update thing can be >> taken care of). > > I agree; the fewer dependencies the better. We could divide it up > into a small, focused core package with only a few dependencies, and > 1-3 more containing the focused bits which require the most > maintenance (Graphics, SearchIO/Tools, etc). I worry about having > too many more. 
>
>> In reality, this may be a rapidly disappearing trait though of those
>> who have grown up in a time when they proudly spent all their savings
>> to buy that new computer because it had a 20MB hard disk, compared to
>> the two 360k floppy drives the previous one had.
>>
>> So don't ask me, just don't make it too hard for the dinosaurs.
>
> There would need to be some way of getting an old-style full-blown
> core installation regardless of how many subdistros we would divy
> core up into. My thought for CPAN was having Bundle::BioPerl take
> over this but I'm not sure if it's still being used. Maybe there are
> other ways for svn/cvs.

Personally, I think this use of Bundle::BioPerl is more in line with what CPAN Bundles were meant to do - "a bundle is a collection of modules that comprise a cohesive unit". Under that definition you could probably put the whole of Bioperl but I won't go there! When a package is updated and a new release is made, this should be installable/updatable via cpan as well as updating the bundle with the correct version. This way you can get all of Bioperl via the bundle, or just install the sub-packages on their own.

If the switch over to svn takes place, will all the Bioperl-* projects move over at the same time? If so, will they go into their own svn repository or into the same one? Since with svn you can check out any subtree of the repository I'm not clear on the pros and cons of either of these options.

Am I right in thinking that there is a way for cvs to define a "project" such that when you check out that "project" it actually checks out multiple projects behind the scenes? I'm sure I've seen this somewhere, possibly when the project is dependent on some 3rd party code that is also in cvs. If this is possible, I'm sure it will also be possible with svn. This could then allow something like the following to happen after the split up of Bioperl. The following projects could be defined: bioperl-core, bioperl-graphics etc. Issuing a checkout of a "project" called "bioperl" would actually check out the real projects called bioperl-core, bioperl-graphics etc. I may just be dreaming, but it seems that this ought to be possible, doesn't it?

>
>>> as it will directly impact projects which rely on core
>>> functionality (GBrowse/GMOD, bioperl-db, etc).
>> Well, I hope there are ways to limit that?
>
> I believe so, yes, particularly for bioperl-db. I would think
> splitting off Bio::Graphics or Bio::DB* will have some effect on
> GBrowse/GFF.
>
>>> I also agree with George that this should be postponed until after
>>> svn issues are taken care of.
>> I agree entirely. Please don't throw this in the same bin or tie one
>> to the other. The migration is neither easier nor faster nor better
>> testable with a partitioned BioPerl.
>>
>> -hilmar
>
> We def. have to complete transition to subversion first, then think
> about this some more.
>
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGeMm7czuW2jkwy2gRAi+CAJ9cNZ70GojV7eviRjdWTFLk/MKYoACg2Ls4
op9sQTZyeK6G6taFhTAPMYc=
=7NRw
-----END PGP SIGNATURE-----

From hlapp at gmx.net Wed Jun 20 11:46:16 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 20 Jun 2007 07:46:16 -0400
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy)
In-Reply-To: <4678C9BC.10206@sheffield.ac.uk>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> <4678C9BC.10206@sheffield.ac.uk>
Message-ID: <5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Jun
20, 2007, at 2:31 AM, Nathan S. Haigh wrote: > If the switch over to svn takes place, will all the Bioperl-* projects > move over at the same time? They are under the same CVSROOT right now. Locking down some sub- repositories but not others may be odd or impossible. > If so, will they go into their own svn repository or into the same > one? Good question, I'm not sure about the pros and cons one way or the other either. The fewer repositories the less sysadmin work in fine- graining permissions. -hilmar - -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (Darwin) iD8DBQFGeRONuV6N2JxL7qsRAoYTAJ9GVuC0j4szCcWTg7yWGoxN3YFucQCgogJ8 Ims4d150lsX0vXtDwGI1lKg= =K4++ -----END PGP SIGNATURE----- From n.haigh at sheffield.ac.uk Wed Jun 20 11:57:22 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Wed, 20 Jun 2007 12:57:22 +0100 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> <4678C9BC.10206@sheffield.ac.uk> <5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net> Message-ID: <46791622.6080409@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hilmar Lapp wrote: > > On Jun 20, 2007, at 2:31 AM, Nathan S. Haigh wrote: > >> If the switch over to svn takes place, will all the Bioperl-* projects >> move over at the same time? > > They are under the same CVSROOT right now. Locking down some > sub-repositories but not others may be odd or impossible. > >> If so, will they go into their own svn repository or into the same one? 
>
> Good question, I'm not sure about the pros and cons one way or the other
> either. The fewer repositories the less sysadmin work in fine-graining
> permissions.
>
> -hilmar
>

I don't think there is any major reason why the following single repos wouldn't do the trick:

/--
  |-bioperl-live
  |    |--- trunk
  |    |--- branches
  |    |--- tags
  |
  |-bioperl-run
       |--- trunk
       |--- branches
       |--- tags

Any reason why this couldn't be used? I know some people don't like the idea of the revision number incrementing for the whole repository if it contains several "projects". However, revision numbers are really only a way for svn to keep track of things and a very large revision number shouldn't really "upset" anyone.

Nath

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGeRYiczuW2jkwy2gRApS5AJsHl73MWZP8aMfOqlLgTYuzpMWmQgCg3VqA
1Vj8BSUnanpdjYYLE6eGanU=
=bOqK
-----END PGP SIGNATURE-----

From hlapp at gmx.net Wed Jun 20 12:08:33 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 20 Jun 2007 08:08:33 -0400
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy)
In-Reply-To: <46791622.6080409@sheffield.ac.uk>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> <4678C9BC.10206@sheffield.ac.uk> <5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net> <46791622.6080409@sheffield.ac.uk>
Message-ID:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Jun 20, 2007, at 7:57 AM, Nathan S. Haigh wrote:

> I don't think there is any major reason why the following single repos
> wouldn't do the trick:
>
> /--
>   |-bioperl-live
>   |    |--- trunk
>   |    |--- branches
>   |    |--- tags
>   |
>   |-bioperl-run
>        |--- trunk
>        |--- branches
>        |--- tags
>
> Any reason why this couldn't be used?
That would work fine except that there are several more sub-projects (bioperl-db, bioperl-graphics, bioperl-microarray, and a few more). That should still be fine. I think what needs to be recognized is the limitations it puts on permission granularity. If it's all the same repository (as is now) then having commit rights to one (subproject) will mean commit rights to all. From my perspective that's fine, it has worked great so far. -hilmar - -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (Darwin) iD8DBQFGeRjFuV6N2JxL7qsRAj3dAJ42r1C8By29DNTUP9Ts0Lf5dOcS9QCgjSE1 hckjT7LBtHcmwGI8B+BKQIM= =gYfA -----END PGP SIGNATURE----- From hartzell at alerce.com Tue Jun 19 19:53:39 2007 From: hartzell at alerce.com (George Hartzell) Date: Tue, 19 Jun 2007 12:53:39 -0700 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> Message-ID: <18040.13379.217277.992742@almost.alerce.com> Steve Chervitz writes: > On 6/16/07, Jason Stajich wrote: > > [...] > > Just to say I already went through all the steps of running cvs2svn > > myself and had problems gathering back out the branches and all the > > tags when I tried it. If you want to start with a smaller repository > > like bioperl-network or bioperl-db as the initial cvs2svn conversion > > script took quite a long time to run on bioperl-live. > > Might this been a good opportunity to investigate partitioning > bioperl-live into sub-repositories? [...] I'd say that the time to do this kind of rearrangement would be *after* the svn repo's set up. That way you'll be able to track stuff back through to the beginning of time. g. 
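A minimal sketch of the single-repository layout discussed in this thread, built here as plain local directories so it can be inspected without a Subversion server. The scratch directory name `layout` and the choice of four sub-projects are assumptions for illustration only; a real conversion would come out of cvs2svn rather than a hand-built skeleton.

```shell
#!/bin/sh
# Sketch only: create a trunk/branches/tags skeleton per sub-project
# as ordinary directories. "layout" is a hypothetical scratch directory.
set -e

root=layout
for project in bioperl-live bioperl-run bioperl-db bioperl-network; do
    for sub in trunk branches tags; do
        mkdir -p "$root/$project/$sub"
    done
done

# Show every project/subdirectory pair that was created.
ls -d "$root"/*/*
```

With a real repository, a skeleton like this could be loaded with `svn import`, and a single sub-project could be fetched by pointing `svn checkout` at its trunk directory. The repository URL is left out because none is given in the thread.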
From sdavis2 at mail.nih.gov Wed Jun 20 12:44:08 2007
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Wed, 20 Jun 2007 08:44:08 -0400
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy)
In-Reply-To:
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> <4678C9BC.10206@sheffield.ac.uk> <5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net> <46791622.6080409@sheffield.ac.uk>
Message-ID: <46792118.4030205@mail.nih.gov>

Hilmar Lapp wrote:
>
> On Jun 20, 2007, at 7:57 AM, Nathan S. Haigh wrote:
>
>> I don't think there is any major reason why the following single repos
>> wouldn't do the trick:
>
>> /--
>>   |-bioperl-live
>>   |    |--- trunk
>>   |    |--- branches
>>   |    |--- tags
>>   |
>>   |-bioperl-run
>>        |--- trunk
>>        |--- branches
>>        |--- tags
>
>> Any reason why this couldn't be used?
>
> That would work fine except that there are several more sub-projects
> (bioperl-db, bioperl-graphics, bioperl-microarray, and a few more).
>
> That should still be fine. I think what needs to be recognized is the
> limitations it puts on permission granularity. If it's all the same
> repository (as is now) then having commit rights to one (subproject)
> will mean commit rights to all. From my perspective that's fine, it
> has worked great so far.

Actually, I think there are ways of creating per-directory access control. See here:

http://svnbook.red-bean.com/en/1.2/svn-book.html#svn.serverconfig.svnserve.auth.general

With Apache-based https access, such access control is relatively straightforward, it appears. With the standalone svn server over ssh, one needs to use "commit hook scripts" to limit access. But I think it is possible (admitting that I have not tried to do this...).
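A minimal sketch of what per-directory rules might look like in Subversion's path-based authorization file format, written out from a shell heredoc so it can be inspected locally. The file name `authz.example`, the group names, and the user names are all hypothetical. How such a file gets wired into a server (for instance through the `AuthzSVNAccessFile` directive of mod_authz_svn under Apache) depends on the setup and is not shown here.

```shell
#!/bin/sh
# Sketch only: write an example path-based authorization file.
# All user and group names below are made up for illustration.
set -e

cat > authz.example <<'EOF'
[groups]
core = alice, bob
graphics = carol

# Everyone may read the whole repository.
[/]
* = r

# Core developers may commit anywhere under bioperl-live.
[/bioperl-live]
@core = rw

# The graphics group may commit only to its own sub-project.
[/bioperl-graphics]
@graphics = rw
@core = rw
EOF

# Count the path sections that were defined.
grep -c '^\[/' authz.example
```

Rules like these would address the permission-granularity concern raised above without splitting the projects into separate repositories, assuming the server is configured to consult the file.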
Sean From hartzell at alerce.com Wed Jun 20 13:23:32 2007 From: hartzell at alerce.com (George Hartzell) Date: Wed, 20 Jun 2007 06:23:32 -0700 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <4678C9BC.10206@sheffield.ac.uk> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> <4678C9BC.10206@sheffield.ac.uk> Message-ID: <18041.10836.728079.835572@almost.alerce.com> Nathan S. Haigh writes: > [...] > If the switch over to svn takes place, will all the Bioperl-* projects > move over at the same time? If so, will they go into their own svn > repository or into the same one? Since with svn you can checkout any > subtree of the repository I'm not clear on the pro's and cons of either > of these options. I'm planning to drop the projects from the top of the CVSROOT into a single svn repository: bioperl-ext bioperl-pipeline biodata bioperl-gui bioperl-run bioperl-cookbook bioperl-live biosql-schema bioperl-corba-client bioperl-microarray html bioperl-corba-server bioperl-network task-manager bioperl-das-client bioperl-papers xml-html bioperl-db bioperl-pedigree although that's open to feedback from the core members. As a progress report, I've built a demo repos with -run, -ext, and -live in it and asked a couple of folks to to take a peek at it. When I get a bit further along I'll figure out how to get something for the public to test. > Am I right in thinking that there is a way for cvs to define a "project" > such that when you checkout that "project" it actually checks out > multiple projects behind the scene? I'm sure I've seen this somewhere, > possibly when the project is dependent on some 3rd party code that is > also in cvs. If this is possible, I'm sure it will also be possible with > svn. 
This could then allow something like the following to happen after > the split up of Bioperl. The following projects could be defined: > bioperl-core, bioperl-graphics etc. Issuing a checkout of a "project" > called "bioperl" would actually checkout the real projects call > bioperl-core, bioperl-graphics etc. I may just be dreaming, but it seems > that this ought to be possible, doesn't it? > [...] I don't think that there's any functionality like that in svn. g. From hartzell at alerce.com Wed Jun 20 13:26:04 2007 From: hartzell at alerce.com (George Hartzell) Date: Wed, 20 Jun 2007 06:26:04 -0700 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <46791622.6080409@sheffield.ac.uk> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> <4678C9BC.10206@sheffield.ac.uk> <5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net> <46791622.6080409@sheffield.ac.uk> Message-ID: <18041.10988.375946.833182@almost.alerce.com> Nathan S. Haigh writes: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hilmar Lapp wrote: > > > > On Jun 20, 2007, at 2:31 AM, Nathan S. Haigh wrote: > > > >> If the switch over to svn takes place, will all the Bioperl-* projects > >> move over at the same time? > > > > They are under the same CVSROOT right now. Locking down some > > sub-repositories but not others may be odd or impossible. > > > >> If so, will they go into their own svn repository or into the same one? > > > > Good question, I'm not sure about the pros and cons one way or the other > > either. The fewer repositories the less sysadmin work in fine-graining > > permissions. 
> > > > -hilmar > > > > > I don't think there is any major reason why the following single repos > wouldn't do the trick: > > /-- > |-bioperl-live > | |--- trunk > | |--- branches > | |--- tags > | > |-bioperl-run > |--- trunk > |--- branches > |--- tags > > Any reason why this couldn't be used? > [...] That's exactly the way that I'm setting it up. g. From n.haigh at sheffield.ac.uk Wed Jun 20 13:33:33 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Wed, 20 Jun 2007 14:33:33 +0100 Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy) In-Reply-To: <18041.10836.728079.835572@almost.alerce.com> References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com> <467788C5.6070406@sendu.me.uk> <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com> <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu> <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net> <4678C9BC.10206@sheffield.ac.uk> <18041.10836.728079.835572@almost.alerce.com> Message-ID: <46792CAD.5060700@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 George Hartzell wrote: > Nathan S. Haigh writes: > > [...] > > If the switch over to svn takes place, will all the Bioperl-* projects > > move over at the same time? If so, will they go into their own svn > > repository or into the same one? Since with svn you can checkout any > > subtree of the repository I'm not clear on the pro's and cons of either > > of these options. > > I'm planning to drop the projects from the top of the CVSROOT into a > single svn repository: > > bioperl-ext bioperl-pipeline biodata bioperl-gui > bioperl-run bioperl-cookbook bioperl-live biosql-schema > bioperl-corba-client bioperl-microarray html bioperl-corba-server > bioperl-network task-manager bioperl-das-client bioperl-papers > xml-html bioperl-db bioperl-pedigree > > although that's open to feedback from the core members. 
> > As a progress report, I've built a demo repos with -run, -ext, and > -live in it and asked a couple of folks to to take a peek at it. When > I get a bit further along I'll figure out how to get something for the > public to test. Could I take a peek?? > > > Am I right in thinking that there is a way for cvs to define a "project" > > such that when you checkout that "project" it actually checks out > > multiple projects behind the scene? I'm sure I've seen this somewhere, > > possibly when the project is dependent on some 3rd party code that is > > also in cvs. If this is possible, I'm sure it will also be possible with > > svn. This could then allow something like the following to happen after > > the split up of Bioperl. The following projects could be defined: > > bioperl-core, bioperl-graphics etc. Issuing a checkout of a "project" > > called "bioperl" would actually checkout the real projects call > > bioperl-core, bioperl-graphics etc. I may just be dreaming, but it seems > > that this ought to be possible, doesn't it? > > [...] > > I don't think that there's any functionality like that in svn. 
I did come across this which might help: http://subversion.tigris.org/servlets/ReadMsg?listName=users&msgNo=43561 Nath -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGeSytczuW2jkwy2gRAnlUAJ4pjhPlYlqOm+M882Ni116MJVzPCwCbB3Su sWDAmqFhGgtlyeawaIGSV14= =zeAY -----END PGP SIGNATURE----- From bix at sendu.me.uk Wed Jun 20 15:38:20 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 20 Jun 2007 16:38:20 +0100 Subject: [Bioperl-l] New testing base: BioperlTest.pm Message-ID: <467949EC.9040100@sendu.me.uk> In considering updating all the test scripts to take advantage of the new network option, and/or reimplementing them in Test::More, I thought now would be a good time to standardize all the test scripts and reduce the possibility of having to alter them all in the future if something changes. For example we could decide on an alternate way of choosing to run network tests, or a new way of deciding to output debug information. There are also some inconsistencies in the messages produced by tests skipping all, and even an unfortunate mistake that has been copy/pasted through a lot of test scripts. My solution is t/lib/BioperlTest.pm (documented with perldoc) We go from this: ---- use strict; our $DEBUG; BEGIN { $DEBUG = $ENV{'BIOPERLDEBUG'} || 0; eval { require Test::More; }; if( $@ ) { use lib 't/lib'; } use Test::More; # the mistake! use Module::Build; my $build = Module::Build->current(); my $do_network_tests = $build->notes('network'); eval { require IO::String; require LWP; require LWP::UserAgent; }; if ($@) { plan skip_all => 'IO::String or LWP or LWP::UserAgentnot installed. This means Bio::Tools::Run::RemoteBlast is not usable. Skipping tests'; } elsif (!$do_network_tests) { plan skip_all => 'Network tests have not been requested, skipping all'; } else { plan tests => 21; } #... } my $obj = Bio::Object->new(-verbose => $DEBUG); #... 
----

To this:

----
use strict;

BEGIN {
    use lib 't/lib';
    use BioperlTest;

    test_begin(-requires_modules => [qw(IO::String LWP LWP::UserAgent)],
               -requires_networking => 1,
               -tests => 21);

    #...
}

my $obj = Bio::Object->new(-verbose => test_debug());
#...
----

Can anyone identify problems with this approach? Is the interface presented by BioperlTest flexible enough that any changes would only be additions for new functionality (and therefore all test scripts wouldn't need to be altered)? Is BioperlTest missing anything you'd like?

Are there any objections to me updating all tests in this manner? For an example, see t/RemoteBlast.t

Cheers,
Sendu.

From spiros at lokku.com Wed Jun 20 15:49:48 2007
From: spiros at lokku.com (Spiros Denaxas)
Date: Wed, 20 Jun 2007 16:49:48 +0100
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <4D4061A2-E937-4F97-85C3-8937B597C64F@uiuc.edu>
References: <467661F0.2060703@sendu.me.uk> <4676A01F.30205@sendu.me.uk> <082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu> <4676B41E.3050706@sendu.me.uk> <4D4061A2-E937-4F97-85C3-8937B597C64F@uiuc.edu>
Message-ID:

Yep, they are not all done. Some still need to be ported over; I am doing some here and there at home. However, the recent email Sendu sent, the one about abstracting the setup of testing, is actually something I was thinking about myself, so it might be a better way to tackle the problem. For one, it would save us from duplicating the same 30 lines of code across all tests.

As far as network tests are concerned, I've always been an avid hater of them. I believe they bring more trouble than they are worth, due to the diversity of setups people have. My way of tackling them was always to group all the tests that required live access into one file and then forcibly just run that, iff needed and not by default. Like I said, that's just my opinion; I've been bitten by them one time too many.
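The opt-in idea for network tests can be sketched with plain Test::More, independent of BioperlTest. This is only an illustration under stated assumptions: the environment variable BIOPERL_NETWORK_TESTS and the helper network_tests_requested() are hypothetical names invented for this sketch, not an existing BioPerl convention, and the single ok() stands in for real tests that need live access.

```perl
use strict;
use warnings;
use Test::More tests => 1;

# Hypothetical opt-in switch: returns 1 only when the (assumed) variable
# BIOPERL_NETWORK_TESTS is set to a true value in the given environment hash.
sub network_tests_requested {
    my ($env) = @_;
    return $env->{BIOPERL_NETWORK_TESTS} ? 1 : 0;
}

SKIP: {
    # Every test in this file needs live access, so all of them are skipped
    # unless the caller explicitly asked for network tests.
    skip 'Network tests have not been requested', 1
        unless network_tests_requested( \%ENV );

    ok( 1, 'placeholder for a test that needs live network access' );
}
```

Run as is, the file reports its one test as skipped; with the variable set in the environment, the placeholder test runs instead.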
Spiros On 6/18/07, Chris Fields wrote: > > On Jun 18, 2007, at 11:34 AM, Sendu Bala wrote: > > > Chris Fields wrote: > >> Couldn't you enable BIOPERLDEBUG, disable network access, then > >> iterate through tests checking for those which fail or skip? > > > > Yes, good idea, though my dev machine is also my email/webserver so > > I'd rather come up with an alternate solution than one involving > > 'disable network access'. > > > > Still, that's what I'll probably end up doing. Cheers! > > > > > > Oh, Chris, Spiros, how goes the Test::More conversion? I might want > > to wait for you to finish, or join in? If you're not going to have > > time to do any more in the next few weeks, can you please update > > http://www.bioperl.org/wiki/TestMoreProgress removing your name (or > > in the opposite case, add your name in)? Its not quite clear to me > > which tests are assigned to whom. Can someone clarify what the > > markings mean? > > > > Cheers, > > Sendu. > > Not sure how far along spiros is; I handed it over after I finished > up to the 'Q' tests. In general the ones marked out have been > converted over, ones with names next to them have been claimed. If > you need help I'll prob. start back up again to finish them off; we > just need to divy them up. > > chris > From hlapp at gmx.net Wed Jun 20 16:27:47 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 20 Jun 2007 12:27:47 -0400 Subject: [Bioperl-l] New testing base: BioperlTest.pm In-Reply-To: <467949EC.9040100@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> Message-ID: Very cool! Sounds like a no-brainer to me to adopt this in all the tests. -hilmar On Jun 20, 2007, at 11:38 AM, Sendu Bala wrote: > In considering updating all the test scripts to take advantage of the > new network option, and/or reimplementing them in Test::More, I > thought > now would be a good time to standardize all the test scripts and > reduce > the possibility of having to alter them all in the future if something > changes. 
> > For example we could decide on an alternate way of choosing to run > network tests, or a new way of deciding to output debug information. > There are also some inconsistencies in the messages produced by tests > skipping all, and even an unfortunate mistake that has been copy/ > pasted > through a lot of test scripts. > > My solution is t/lib/BioperlTest.pm (documented with perldoc) > > We go from this: > > ---- > use strict; > our $DEBUG; > > BEGIN { > $DEBUG = $ENV{'BIOPERLDEBUG'} || 0; > > eval { require Test::More; }; > if( $@ ) { > use lib 't/lib'; > } > use Test::More; # the mistake! > > use Module::Build; > my $build = Module::Build->current(); > my $do_network_tests = $build->notes('network'); > > eval { > require IO::String; > require LWP; > require LWP::UserAgent; > }; > if ($@) { > plan skip_all => 'IO::String or LWP or LWP::UserAgentnot > installed. > This means Bio::Tools::Run::RemoteBlast is not usable. Skipping > tests'; > } > elsif (!$do_network_tests) { > plan skip_all => 'Network tests have not been requested, skipping > all'; > } > else { > plan tests => 21; > } > > #... > } > > my $obj = Bio::Object->new(-verbose => $DEBUG); > #... > ---- > > To this: > > ---- > use strict; > > BEGIN { > use lib 't/lib'; > use BioperlTest; > > test_begin(-requires_modules => [qw(IO::String LWP > LWP::UserAgent)], > -requires_networking => 1, > -tests => 21); > > #... > } > > my $obj = Bio::Object->new(-verbose => test_debug()); > #... > ---- > > > Can anyone identify problems with this approach? Is the interface > presented by BioperlTest flexible enough that any changes would > only be > additions for new functionality (and therefore all test scripts > wouldn't > need to be altered)? Is BioperlTest missing anything you'd like? > > Are there any objections to me updating all tests in this manner? > For an > example, see t/RemoteBlast.t > > > Cheers, > Sendu. 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Wed Jun 20 16:44:01 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 20 Jun 2007 11:44:01 -0500 Subject: [Bioperl-l] New testing base: BioperlTest.pm In-Reply-To: References: <467949EC.9040100@sendu.me.uk> Message-ID: Agreed! You've already created an example case so there's something to go off of. I plan on changing some EUtilities tests soon so I'll try implementing this, basing off your RemoteBlast.t implementation. Seems clear enough on the surface; if I run into problems I'll post. chris On Jun 20, 2007, at 11:27 AM, Hilmar Lapp wrote: > Very cool! Sounds like a no-brainer to me to adopt this in all the > tests. -hilmar > > On Jun 20, 2007, at 11:38 AM, Sendu Bala wrote: > >> In considering updating all the test scripts to take advantage of the >> new network option, and/or reimplementing them in Test::More, I >> thought >> now would be a good time to standardize all the test scripts and >> reduce >> the possibility of having to alter them all in the future if >> something >> changes. >> >> For example we could decide on an alternate way of choosing to run >> network tests, or a new way of deciding to output debug information. >> There are also some inconsistencies in the messages produced by tests >> skipping all, and even an unfortunate mistake that has been copy/ >> pasted >> through a lot of test scripts. 
>> >> My solution is t/lib/BioperlTest.pm (documented with perldoc) >> >> We go from this: >> >> ---- >> use strict; >> our $DEBUG; >> >> BEGIN { >> $DEBUG = $ENV{'BIOPERLDEBUG'} || 0; >> >> eval { require Test::More; }; >> if( $@ ) { >> use lib 't/lib'; >> } >> use Test::More; # the mistake! >> >> use Module::Build; >> my $build = Module::Build->current(); >> my $do_network_tests = $build->notes('network'); >> >> eval { >> require IO::String; >> require LWP; >> require LWP::UserAgent; >> }; >> if ($@) { >> plan skip_all => 'IO::String or LWP or LWP::UserAgentnot >> installed. >> This means Bio::Tools::Run::RemoteBlast is not usable. Skipping >> tests'; >> } >> elsif (!$do_network_tests) { >> plan skip_all => 'Network tests have not been requested, >> skipping >> all'; >> } >> else { >> plan tests => 21; >> } >> >> #... >> } >> >> my $obj = Bio::Object->new(-verbose => $DEBUG); >> #... >> ---- >> >> To this: >> >> ---- >> use strict; >> >> BEGIN { >> use lib 't/lib'; >> use BioperlTest; >> >> test_begin(-requires_modules => [qw(IO::String LWP >> LWP::UserAgent)], >> -requires_networking => 1, >> -tests => 21); >> >> #... >> } >> >> my $obj = Bio::Object->new(-verbose => test_debug()); >> #... >> ---- >> >> >> Can anyone identify problems with this approach? Is the interface >> presented by BioperlTest flexible enough that any changes would >> only be >> additions for new functionality (and therefore all test scripts >> wouldn't >> need to be altered)? Is BioperlTest missing anything you'd like? >> >> Are there any objections to me updating all tests in this manner? >> For an >> example, see t/RemoteBlast.t >> >> >> Cheers, >> Sendu. 
> --
> ===========================================================
> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
> ===========================================================

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From wollenbergk at mail.nih.gov Wed Jun 20 18:11:04 2007
From: wollenbergk at mail.nih.gov (Wollenberg, Kurt (NIH/NIAID))
Date: Wed, 20 Jun 2007 14:11:04 -0400
Subject: [Bioperl-l] get_sequence() gets some sequences but not others
Message-ID:

Greetings:

I am working on a script to take a list of sequence IDs, extract the sequences from GenPept, and then run a BLAST search for each of the retrieved sequences. I am having a problem with the sequence retrieval, where some sequences are found and others are not and it's not obvious to me why this is.

For example, using a text file containing the two following IDs as input:
SKG3_YEAST
NEM1_YEAST

My script

while( ) {
    chomp;
    my $seqid = $_;
    my $seq_obj = get_sequence( 'genpept', $seqid );
}

will create a sequence object for the first ID, (print "Accession of ",$seqid," is ",$seq_obj->accession, "\n"; gives me the correct accession number) but for the second I am told

-------------------- WARNING ---------------------
MSG: id (NEM1_YEAST) does not exist
---------------------------------------------------

When I pull up these records using the Entrez cross-database search in my web browser I find genpept records for both SKG3_YEAST and NEM1_YEAST (using these search terms).
In both records these IDs reside in the same field ("DBSOURCE swissprot: locus") so I'm mystified why get_sequence finds one but not the other. Any advice would be greatly appreciated.

Cheers,
Kurt Wollenberg, Ph.D.
Phylogenetics and Sequence Analysis Consultant
Biocomputing Research Consulting Section
Bioinformatics and Scientific IT Program (BSIP)
NIH/NIAID/OTIS
Contractor, Lockheed Martin
http://bioinformatics.niaid.nih.gov

Disclaimer: The information in this e-mail and any of its attachments is confidential and may contain sensitive information. It should not be used by anyone who is not the original intended recipient. If you have received this e-mail in error please inform the sender and delete it from your mailbox or any other storage devices. National Institute of Allergy and Infectious Diseases shall not accept liability for any statements made that are sender's own and not expressly made on behalf of the NIAID by one of its representatives.

From bosborne11 at verizon.net Wed Jun 20 18:59:39 2007
From: bosborne11 at verizon.net (Brian Osborne)
Date: Wed, 20 Jun 2007 14:59:39 -0400
Subject: [Bioperl-l] get_sequence() gets some sequences but not others
In-Reply-To:
Message-ID:

Kurt,

I can't answer your question but I wouldn't use Bio::Perl myself, I'd use Bio::DB::GenPept:

501 ~>perl -e 'use Bio::DB::GenPept; $db = Bio::DB::GenPept->new; $seq = $db->get_Seq_by_acc('NEM1_YEAST'); print $seq->seq;'
MNALKYFSNHLITTKKQKKINVEVTKNQDLLGPSKEVSNKYTSHSENDCVSEVDQQYDHSSSHLKESDQNQERKNS
VPKKPKALRSILIEKIASILWALLLFLPYYLIIKPLMSLWFVFTFPLSVIERRVKHTDKRNRGSNASENELPVSSS
NINDSSEKTNPKNCNLNTIPEAVEDDLNASDEIILQRDNVKGSLLRAQSVKSRPRSYSKSELSLSNHSSSNTVFGT
KRMGRFLFPKKLIPKSVLNTQKKKKLVIDLDETLIHSASRSTTHSNSSQGHLVEVKFGLSGIRTLYFIHKRPYCDL
FLTKVSKWYDLIIFTASMKEYADPVIDWLESSFPSSFSKRYYRSDCVLRDGVGYIKDLSIVKDSEENGKGSSSSLD
DVIIIDNSPVSYAMNVDNAIQVEGWISDPTDTDLLNLLPFLEAMRYSTDVRNILALKHGEKAFNIN
502 ~>

It's true that Bio::Perl is easy-to-use but it's also _very_ limited.

Brian O.
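For a list of IDs read from a file, as in Kurt's original script, the direct Bio::DB::GenPept approach might look roughly like the sketch below. It is a minimal sketch, not a tested replacement: the helper names read_ids() and fetch_accessions() and the script name in the usage comment are hypothetical, and it assumes that a failed lookup either throws or returns nothing, which the eval is there to absorb so that one bad ID does not stop the loop.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read one ID per line from a filehandle, trimming whitespace and
# skipping blank lines. (Hypothetical helper, not part of BioPerl.)
sub read_ids {
    my ($fh) = @_;
    my @ids;
    while ( my $line = <$fh> ) {
        chomp $line;
        $line =~ s/^\s+|\s+$//g;
        push @ids, $line if length $line;
    }
    return @ids;
}

# Fetch each ID directly with Bio::DB::GenPept; failures are reported
# and skipped rather than aborting the run. (Hypothetical helper.)
sub fetch_accessions {
    my (@ids) = @_;
    require Bio::DB::GenPept;    # loaded lazily so read_ids() works without BioPerl
    my $db = Bio::DB::GenPept->new;
    for my $id (@ids) {
        my $seq = eval { $db->get_Seq_by_acc($id) };
        if ($seq) {
            print "Accession of $id is ", $seq->accession_number, "\n";
        }
        else {
            warn "Could not retrieve $id\n";
        }
    }
    return;
}

# Usage (script and file names are hypothetical):
#   perl fetch_genpept.pl ids.txt
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    fetch_accessions( read_ids($fh) );
    close $fh;
}
```

Wrapping each lookup in eval keeps the retrieved sequences available for a later BLAST step even when some IDs cannot be found.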
On 6/20/07 2:11 PM, "Wollenberg, Kurt (NIH/NIAID)" wrote: > Greetings: > > I am working on a script to take a list of sequence IDs, extract the > sequences from GenPept, and then run a BLAST search for each of the > retrieved sequences. I am having a problem with the sequence retrieval, > where some sequences are found and others are not and it's not obvious to me > why this is. > > For example, using a text file containing the two following IDs as input: > SKG3_YEAST > NEM1_YEAST > > My script > > while( ) { > chomp; > my $seqid = $_; > my $seq_obj = get_sequence( 'genpept', $seqid ); > } > > will create a sequence object for the first ID, (print "Accession of > ",$seqid," is ",$seq_obj->accession, "\n"; gives me the correct accession > number) but for the second I am told > > -------------------- WARNING --------------------- > MSG: id (NEM1_YEAST) does not exist > --------------------------------------------------- > > When I pull up these records using the Entrez cross-databse search in my web > browser I find genpept records for both SKG3_YEAST and NEM1_YEAST (using > these search terms). In both records these IDs reside in the same field > ("DBSOURCE swissprot: locus") so I'm mystified why get_sequence finds one > but not the other. Any advice would be greatly appreciated. > > Cheers, > Kurt Wollenberg, Ph.D. > Phylogenetics and Sequence Analysis Consultant > Biocomputing Research Consulting Section > Bioinformatics and Scientific IT Program (BSIP) > NIH/NIAID/OTIS > Contractor, Lockheed Martin > http://bioinformatics.niaid.nih.gov > > Disclaimer: > The information in this e-mail and any of its attachments is confidential > and may contain sensitive information. It should not be used by anyone who > is not the original intended recipient. If you have received this e-mail in > error please inform the sender and delete it from your mailbox or any other > storage devices. 
National Institute of Allergy and Infectious Diseases shall > not accept liability for any statements made that are sender's own and not > expressly made on behalf of the NIAID by one of its representatives. > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at uiuc.edu Wed Jun 20 20:11:34 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 20 Jun 2007 15:11:34 -0500 Subject: [Bioperl-l] get_sequence() gets some sequences but not others In-Reply-To: References: Message-ID: I'm assuming you are using the Bio::Perl exported sub get_sequence (). I am able to reproduce the issue using bioperl-live; it's an odd issue as direct use of Bio::DB::GenPept works fine: use Bio::DB::GenPept; my $factory = Bio::DB::GenPept->new(); my @accs = qw(SKG3_YEAST NEM1_YEAST); my $io = $factory->get_Stream_by_acc(\@accs); while (my $seq = $io->next_seq) { print "Accession:",$seq->accession,"\n"; } chris On Jun 20, 2007, at 1:11 PM, Wollenberg, Kurt (NIH/NIAID) wrote: > Greetings: > > I am working on a script to take a list of sequence IDs, extract the > sequences from GenPept, and then run a BLAST search for each of the > retrieved sequences. I am having a problem with the sequence > retrieval, > where some sequences are found and others are not and it's not > obvious to me > why this is. 
> > For example, using a text file containing the two following IDs as > input: > SKG3_YEAST > NEM1_YEAST > > My script > > while( ) { > chomp; > my $seqid = $_; > my $seq_obj = get_sequence( 'genpept', $seqid ); > } > > will create a sequence object for the first ID, (print "Accession of > ",$seqid," is ",$seq_obj->accession, "\n"; gives me the correct > accession > number) but for the second I am told > > -------------------- WARNING --------------------- > MSG: id (NEM1_YEAST) does not exist > --------------------------------------------------- > > When I pull up these records using the Entrez cross-databse search > in my web > browser I find genpept records for both SKG3_YEAST and NEM1_YEAST > (using > these search terms). In both records these IDs reside in the same > field > ("DBSOURCE swissprot: locus") so I'm mystified why get_sequence > finds one > but not the other. Any advice would be greatly appreciated. > > Cheers, > Kurt Wollenberg, Ph.D. > Phylogenetics and Sequence Analysis Consultant > Biocomputing Research Consulting Section > Bioinformatics and Scientific IT Program (BSIP) > NIH/NIAID/OTIS > Contractor, Lockheed Martin > http://bioinformatics.niaid.nih.gov > > Disclaimer: > The information in this e-mail and any of its attachments is > confidential > and may contain sensitive information. It should not be used by > anyone who > is not the original intended recipient. If you have received this e- > mail in > error please inform the sender and delete it from your mailbox or > any other > storage devices. National Institute of Allergy and Infectious > Diseases shall > not accept liability for any statements made that are sender's own > and not > expressly made on behalf of the NIAID by one of its representatives. > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From sac at bioperl.org Thu Jun 21 06:32:47 2007 From: sac at bioperl.org (Steve Chervitz) Date: Wed, 20 Jun 2007 23:32:47 -0700 Subject: [Bioperl-l] New testing base: BioperlTest.pm In-Reply-To: References: <467949EC.9040100@sendu.me.uk> Message-ID: <8f200b4c0706202332w25a09547k1de20f24466877d9@mail.gmail.com> Looks like a nice refactor. After it's in place, don't forget to update the wiki: http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests Steve On 6/20/07, Chris Fields wrote: > Agreed! You've already created an example case so there's something > to go off of. > > I plan on changing some EUtilities tests soon so I'll try > implementing this, basing off your RemoteBlast.t implementation. > Seems clear enough on the surface; if I run into problems I'll post. > > chris > > On Jun 20, 2007, at 11:27 AM, Hilmar Lapp wrote: > > > Very cool! Sounds like a no-brainer to me to adopt this in all the > > tests. -hilmar > > > > On Jun 20, 2007, at 11:38 AM, Sendu Bala wrote: > > > >> In considering updating all the test scripts to take advantage of the > >> new network option, and/or reimplementing them in Test::More, I > >> thought > >> now would be a good time to standardize all the test scripts and > >> reduce > >> the possibility of having to alter them all in the future if > >> something > >> changes. > >> > >> For example we could decide on an alternate way of choosing to run > >> network tests, or a new way of deciding to output debug information. > >> There are also some inconsistencies in the messages produced by tests > >> skipping all, and even an unfortunate mistake that has been copy/ > >> pasted > >> through a lot of test scripts. 
> >> > >> My solution is t/lib/BioperlTest.pm (documented with perldoc) > >> > >> We go from this: > >> > >> ---- > >> use strict; > >> our $DEBUG; > >> > >> BEGIN { > >> $DEBUG = $ENV{'BIOPERLDEBUG'} || 0; > >> > >> eval { require Test::More; }; > >> if( $@ ) { > >> use lib 't/lib'; > >> } > >> use Test::More; # the mistake! > >> > >> use Module::Build; > >> my $build = Module::Build->current(); > >> my $do_network_tests = $build->notes('network'); > >> > >> eval { > >> require IO::String; > >> require LWP; > >> require LWP::UserAgent; > >> }; > >> if ($@) { > >> plan skip_all => 'IO::String or LWP or LWP::UserAgentnot > >> installed. > >> This means Bio::Tools::Run::RemoteBlast is not usable. Skipping > >> tests'; > >> } > >> elsif (!$do_network_tests) { > >> plan skip_all => 'Network tests have not been requested, > >> skipping > >> all'; > >> } > >> else { > >> plan tests => 21; > >> } > >> > >> #... > >> } > >> > >> my $obj = Bio::Object->new(-verbose => $DEBUG); > >> #... > >> ---- > >> > >> To this: > >> > >> ---- > >> use strict; > >> > >> BEGIN { > >> use lib 't/lib'; > >> use BioperlTest; > >> > >> test_begin(-requires_modules => [qw(IO::String LWP > >> LWP::UserAgent)], > >> -requires_networking => 1, > >> -tests => 21); > >> > >> #... > >> } > >> > >> my $obj = Bio::Object->new(-verbose => test_debug()); > >> #... > >> ---- > >> > >> > >> Can anyone identify problems with this approach? Is the interface > >> presented by BioperlTest flexible enough that any changes would > >> only be > >> additions for new functionality (and therefore all test scripts > >> wouldn't > >> need to be altered)? Is BioperlTest missing anything you'd like? > >> > >> Are there any objections to me updating all tests in this manner? > >> For an > >> example, see t/RemoteBlast.t > >> > >> > >> Cheers, > >> Sendu. 
> >> _______________________________________________ > >> Bioperl-l mailing list > >> Bioperl-l at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > -- > > =========================================================== > > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > > =========================================================== > > > > > > > > > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From staffa at niehs.nih.gov Thu Jun 21 18:36:12 2007 From: staffa at niehs.nih.gov (Staffa, Nick (NIH/NIEHS)) Date: Thu, 21 Jun 2007 14:36:12 -0400 Subject: [Bioperl-l] BIO::DB::FASTA ID Message-ID: This program below returns only 1527 IDs from a fasta file that I have constructed, which has mildred> grep -c "^>Dpse" T_orthologs_Dpse_genes.fa 1820 . It actually does not return the first 3 ids, nor the 5th, nor 7..36, 38,39,41..44...... The header lines are of variable length and the sequence lines are 80 characters except at the ends when they might be shorter. Is there some caveat that I am ignoring in my format that breaks bio::db::fasta? 
#!/usr/bin/perl # # # use strict; use Bio::DB::Fasta; use Bio::Tools::SeqWords; use Bio::Seq; use Bio::SeqIO; $|=1; # # my $Dpse_UTR_file_for_T_orthologs = "/home/staffa/clients/Kari/D_pse_genome/testit/T_orthologs_Dpse_genes.fa"; my $db = Bio::DB::Fasta->new ('/home/staffa/clients/Kari/D_pse_genome/testit/T_orthologs_Dpse_genes.fa', -reindex, -makeid => \&make_my_id); my @ids = $db->ids; my $number_in = @ids; print "number of Dpse IDs = $number_in\n"; foreach my $id (@ids){ print "$id\n"; } sub make_my_id { # parse header line: # >Dpse_GA13134 CG14636 NO UTR has 2 TATTTAT 117 145, 0 TTATTTATT my $line = shift; # print "line = $line\n"; $line =~ />(\w+) /; my $ID = $1; # print "ID = $ID\n"; return $ID; } -------------- next part -------------- A non-text attachment was scrubbed... Name: T_orthologs_Dpse_genes.fa Type: application/octet-stream Size: 5033676 bytes Desc: not available URL: From jason at bioperl.org Thu Jun 21 21:19:14 2007 From: jason at bioperl.org (Jason Stajich) Date: Thu, 21 Jun 2007 14:19:14 -0700 Subject: [Bioperl-l] BIO::DB::FASTA ID In-Reply-To: References: Message-ID: Hey Nick - I think a) your IDs are not unique b) you need to declare the function make_my_id BEFORE your call Bio::DB::Fasta->new if you want your function to be used. 
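Point (a) can be checked without BioPerl. A minimal sketch, assuming the ID is the first word after '>' as in make_my_id above; the sample headers and the count_ids helper are made up for illustration and are not taken from Nick's file:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical header lines standing in for the real FASTA file.
my @headers = (
    '>Dpse_GA13134 CG14636 NO UTR has 2 TATTTAT 117 145, 0 TTATTTATT',
    '>Dpse_GA13134 CG14636 a second record with the same first word',
    '>Dpse_GA10001 CG00001 another record',
);

# Close variant of the ID rule in the script above:
# the ID is the first word after '>'.
sub make_my_id {
    my $line = shift;
    return $line =~ /^>(\w+)/ ? $1 : undef;
}

# Count how often each ID occurs; returns a hash ref of ID => count.
sub count_ids {
    my @lines = @_;
    my %seen;
    for my $line (@lines) {
        my $id = make_my_id($line);
        next unless defined $id;
        $seen{$id}++;
    }
    return \%seen;
}

my $seen = count_ids(@headers);
my @dups = grep { $seen->{$_} > 1 } sort keys %$seen;

printf "%d headers, %d unique IDs\n", scalar(@headers), scalar(keys %$seen);
print "duplicated: $_ ($seen->{$_} times)\n" for @dups;
```

If the number of unique IDs is smaller than the number of headers, an index keyed on that ID can hold only one entry per duplicated ID, which would explain a shortfall like the one reported.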
$ grep "^>" T_orthologs_Dpse_genes.fa | awk '{print $1}' | sort | uniq | wc -l 1527 -jason On Jun 21, 2007, at 11:36 AM, Staffa, Nick (NIH/NIEHS) wrote: > #!/usr/bin/perl > # > # > # > use strict; > use Bio::DB::Fasta; > use Bio::Tools::SeqWords; > use Bio::Seq; > use Bio::SeqIO; > $|=1; > # > # > my $Dpse_UTR_file_for_T_orthologs = > "/home/staffa/clients/Kari/D_pse_genome/testit/ > T_orthologs_Dpse_genes.fa"; > my $db = Bio::DB::Fasta->new > ('/home/staffa/clients/Kari/D_pse_genome/testit/ > T_orthologs_Dpse_genes.fa', > -reindex, -makeid => \&make_my_id); > my @ids = $db->ids; > my $number_in = @ids; > print "number of Dpse IDs = $number_in\n"; > foreach my $id (@ids){ > print "$id\n"; > } > sub make_my_id { > # parse header line: > # >Dpse_GA13134 CG14636 NO UTR has 2 TATTTAT 117 145, 0 > TTATTTATT > my $line = shift; > # print "line = $line\n"; > $line =~ />(\w+) /; > my $ID = $1; > # print "ID = $ID\n"; > return $ID; > } -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From mkiwala at watson.wustl.edu Thu Jun 21 21:23:46 2007 From: mkiwala at watson.wustl.edu (Michael Kiwala) Date: Thu, 21 Jun 2007 16:23:46 -0500 Subject: [Bioperl-l] BIO::DB::FASTA ID In-Reply-To: References: Message-ID: <467AEC62.2040508@watson.wustl.edu> You only have 1527 unique id's in the file. ~$ grep '^>' Desktop/T_orthologs_Dpse_genes.fa|cut -d\ -f1|sort -u|wc -l 1527 Change your make_id function to make sure the id's are unique. Staffa, Nick (NIH/NIEHS) wrote: > This program below returns only 1527 IDs from a fasta file that I have > constructed, which has > mildred> grep -c "^>Dpse" T_orthologs_Dpse_genes.fa > 1820 > . > It actually does not return the first 3 ids, > nor the 5th, nor 7..36, 38,39,41..44...... > The header lines are of variable length and the sequence lines are 80 > characters except at the ends when they might be shorter. > Is there some caveat that I am ignoring in my format that breaks > bio::db::fasta? 
> > > #!/usr/bin/perl > # > # > # > use strict; > use Bio::DB::Fasta; > use Bio::Tools::SeqWords; > use Bio::Seq; > use Bio::SeqIO; > $|=1; > # > # > my $Dpse_UTR_file_for_T_orthologs = > "/home/staffa/clients/Kari/D_pse_genome/testit/T_orthologs_Dpse_genes.fa"; > my $db = Bio::DB::Fasta->new > ('/home/staffa/clients/Kari/D_pse_genome/testit/T_orthologs_Dpse_genes.fa', > -reindex, -makeid => \&make_my_id); > my @ids = $db->ids; > my $number_in = @ids; > print "number of Dpse IDs = $number_in\n"; > foreach my $id (@ids){ > print "$id\n"; > } > sub make_my_id { > # parse header line: > # >Dpse_GA13134 CG14636 NO UTR has 2 TATTTAT 117 145, 0 TTATTTATT > my $line = shift; > # print "line = $line\n"; > $line =~ />(\w+) /; > my $ID = $1; > # print "ID = $ID\n"; > return $ID; > } > > ------------------------------------------------------------------------ > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From bix at sendu.me.uk Mon Jun 25 13:06:27 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 25 Jun 2007 14:06:27 +0100 Subject: [Bioperl-l] New testing base: BioperlTest.pm In-Reply-To: <467949EC.9040100@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> Message-ID: <467FBDD3.8050009@sendu.me.uk> Sendu Bala wrote: > In considering updating all the test scripts to [... use] t/lib/BioperlTest.pm I'm now in the process of converting all test scripts. In addition to those things mentioned previously, BioperlTest now also provides the methods test_input_file() and test_output_file(). This: ---- use Bio::Root::IO; my $output_file = Bio::Root::IO->catfile(qw(t data temp.file)); $obj->new(-file => ">$output_file"); END { unlink($output_file); } ... $obj->new(-file => Bio::Root::IO->catfile(qw(t data input.file))); ---- Becomes this: ---- my $output_file = test_output_file(); $obj->new(-file => ">$output_file"); ... 
$obj->new(-file => test_input_file('input.file')); ---- I should think the benefits are obvious, especially for the output files, which thanks to inconsistency of using END blocks correctly or at all, leaves some output data behind on occasion. test_input_file() is helpful for the shorthand, but also gets rid of many tests' usage of Bio::Root::IO (relying on something you're installing and testing in another test script to work in the current test script, without testing it in your own test script seems like a no-no to me). From cjfields at uiuc.edu Mon Jun 25 13:39:21 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 25 Jun 2007 08:39:21 -0500 Subject: [Bioperl-l] New testing base: BioperlTest.pm In-Reply-To: <467FBDD3.8050009@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> Message-ID: <53825978-65F8-49F4-8836-2F6A26A5CAC4@uiuc.edu> On Jun 25, 2007, at 8:06 AM, Sendu Bala wrote: > Sendu Bala wrote: >> In considering updating all the test scripts to [... use] t/lib/ >> BioperlTest.pm > > I'm now in the process of converting all test scripts. In addition to > those things mentioned previously, BioperlTest now also provides the > methods test_input_file() and test_output_file(). > > > This: > ---- > use Bio::Root::IO; > my $output_file = Bio::Root::IO->catfile(qw(t data temp.file)); > $obj->new(-file => ">$output_file"); > > END { > unlink($output_file); > } > > ... > > $obj->new(-file => Bio::Root::IO->catfile(qw(t data input.file))); > ---- > > > Becomes this: > ---- > my $output_file = test_output_file(); > $obj->new(-file => ">$output_file"); > > ... > > $obj->new(-file => test_input_file('input.file')); > ---- > > > I should think the benefits are obvious, especially for the output > files, which thanks to inconsistency of using END blocks correctly > or at > all, leaves some output data behind on occasion. Sounds fine by me, though it's a lot of work. 
BTW, did we ever decide whether to finish up with Test::More
conversion? I haven't heard back yet; let me know what you want to do.

> test_input_file() is helpful for the shorthand, but also gets rid of
> many tests' usage of Bio::Root::IO (relying on something you're
> installing and testing in another test script to work in the current
> test script, without testing it in your own test script seems like a
> no-no to me).

Well, in a way isn't that itself a test of the class (whether it
breaks or not)? ;>

Do test_input_file() and test_output_file() handle directory
structures in an OS-safe way like catfile()? For instance, I plan on
adding test data to a new directory similar to Bio::Graphics
(t/data/eutil) to prevent cluttering of the t/data directory. I could
use '$obj->new(-file => test_input_file('/eutil/input.xml'))' if the
base directory is 't/data' but that may not be cross-platform
compatible with win32 file systems, which may still expect something
like 't\data\eutil\input.xml'.

chris

From bix at sendu.me.uk Mon Jun 25 13:45:23 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 25 Jun 2007 14:45:23 +0100
Subject: [Bioperl-l] New testing base: BioperlTest.pm
In-Reply-To: <53825978-65F8-49F4-8836-2F6A26A5CAC4@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<53825978-65F8-49F4-8836-2F6A26A5CAC4@uiuc.edu>
Message-ID: <467FC6F3.6080705@sendu.me.uk>

Chris Fields wrote:
> On Jun 25, 2007, at 8:06 AM, Sendu Bala wrote:
>> I should think the benefits are obvious, especially for the output
>> files, which thanks to inconsistency of using END blocks correctly or at
>> all, leaves some output data behind on occasion.
>
> Sounds fine by me, though it's a lot of work. BTW, did we ever decide
> whether to finish up with Test::More conversion? I haven't heard back
> yet; let me know what you want to do.

I'm doing the remaining Test::More conversions at the same time.
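On the path question raised earlier in this thread: a minimal sketch of joining path components portably with the core File::Spec module. The helper name my_input_file and the t/data base directory are assumptions for illustration; this is not BioperlTest's actual code.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper (not BioperlTest's test_input_file): join the
# assumed base directory t/data with the caller's path components,
# letting File::Spec pick the separator for the current OS.
sub my_input_file {
    my @parts = @_;
    return File::Spec->catfile('t', 'data', @parts);
}

# Pass directory and file name as separate arguments instead of
# embedding '/' or '\' in a single string.
print my_input_file('eutil', 'input.xml'), "\n";
```

Keeping the components separate means no test script has to hard-code a path separator.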
> Do test_input_file() and test_output_file() handle directory structures
> in an OS-safe way like catfile()? For instance, I plan on adding test
> data to a new directory similar to Bio::Graphics (t/data/eutil) to
> prevent cluttering of the t/data directory. I could use
> '$obj->new(-file => test_input_file('/eutil/input.xml'))' if the base
> directory is 't/data' but that may not be cross-platform compatible with
> win32 file systems, which may still expect something like
> 't\data\eutil\input.xml'.

It's platform-independent, currently implemented using File::Spec. So
you'll say:

$obj->new(-file => test_input_file('eutil', 'input.xml'));

It's all documented in the POD of BioperlTest.

From cjfields at uiuc.edu Mon Jun 25 13:49:51 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 25 Jun 2007 08:49:51 -0500
Subject: [Bioperl-l] New testing base: BioperlTest.pm
In-Reply-To: <467FC6F3.6080705@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<53825978-65F8-49F4-8836-2F6A26A5CAC4@uiuc.edu>
	<467FC6F3.6080705@sendu.me.uk>
Message-ID: <679B8E76-C090-4A29-B843-99B5853FE2FB@uiuc.edu>

On Jun 25, 2007, at 8:45 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> On Jun 25, 2007, at 8:06 AM, Sendu Bala wrote:
>>> I should think the benefits are obvious, especially for the output
>>> files, which thanks to inconsistency of using END blocks
>>> correctly or at
>>> all, leaves some output data behind on occasion.
>> Sounds fine by me, though it's a lot of work. BTW, did we ever
>> decide whether to finish up with Test::More conversion? I haven't
>> heard back yet; let me know what you want to do.
>
> I'm doing the remaining Test::More conversions at the same time.

Okay. Just didn't want to do any redundant work if it's already
being/been done.

>> Do test_input_file() and test_output_file() handle directory
>> structures in an OS-safe way like catfile()?
For instance, I plan >> on adding test data to a new directory similar to Bio::Graphics (t/ >> data/eutil) to prevent cluttering of the t/data directory. I >> could use '$obj->new(-file => test_input_file('/eutil/ >> input.xml'))' if the base directory is 't/data' but that may not >> be cross-platform compatible with win32 file systems, which may >> still expect something like 't\data\eutil\input.xml'. > > Its platform-independent, currently implemented using File::Spec. > So you'll say: > > $obj->new(-file => test_input_file('eutil', 'input.xml')); > > Its all documented in the POD of BioperlTest. yay! chris From mmokrejs at ribosome.natur.cuni.cz Mon Jun 25 16:06:24 2007 From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=) Date: Mon, 25 Jun 2007 18:06:24 +0200 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? In-Reply-To: <467254DD.3010505@mrc-lmb.cam.ac.uk> References: <466938F6.7050903@ribosome.natur.cuni.cz> <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> <467178AE.5040905@ribosome.natur.cuni.cz> <46717990.6040509@ribosome.natur.cuni.cz> <467254DD.3010505@mrc-lmb.cam.ac.uk> Message-ID: <467FE800.4010300@ribosome.natur.cuni.cz> Dave Howorth wrote: > Martin MOKREJ? wrote: >>>> Also, there is a *huge* amount of documentation and examples on >>>> the BioPerl website. >>>> >>>> http://www.bioperl.org/wiki/HOWTOs >>> You mean >>> http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File >>> ? ;-) >> $ perl embl2picture.pl ~/99.gb | display - Error returned while >> evaluating value of 'description' option for glyph >> Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature >> Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method >> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl >> line 141, line 125. > > Hmm an error at line 141 of a 69 line script? Methinks you're not > actually running the script that's presented on the wiki page you > quoted. 
I cut-and-pasted the script and your file and it worked for me > (at least, it produced an image, along with a bunch of OOPS lines) Maybe you used the first version of the script? There are two or more scripts, I used the very last one. M. From cjfields at uiuc.edu Mon Jun 25 16:48:30 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 25 Jun 2007 11:48:30 -0500 Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file? In-Reply-To: <467FE7B0.3010904@ribosome.natur.cuni.cz> References: <466938F6.7050903@ribosome.natur.cuni.cz> <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu> <467178AE.5040905@ribosome.natur.cuni.cz> <46717990.6040509@ribosome.natur.cuni.cz> <46723F91.60501@ribosome.natur.cuni.cz> <467FE7B0.3010904@ribosome.natur.cuni.cz> Message-ID: Martin, Keep bioperl-related discussion on the bioperl mail list. The large majority of this isn't biopython-related, but maybe some devs there can add to this? On Jun 25, 2007, at 11:05 AM, Martin MOKREJ? wrote: ... > Would you please tell me exactly what is wrong with the spacing? Here's a section of the seq record attached to your previous email: DEFINITION . ACCESSION . VERSION . SOURCE . ORGANISM . Normally there is a fixed column width for any data present in a field, so it would look more like this: DEFINITION PYR4 (DIHYDROOROTASE, PYRIMIDIN 4, dihydroorotase); dihydroorotase [Arabidopsis thaliana]. ACCESSION NP_194024 VERSION NP_194024.1 GI:15235865 DBSOURCE REFSEQ: accession NM_118422.3 KEYWORDS . SOURCE Arabidopsis thaliana (thale cress) ORGANISM Arabidopsis thaliana Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta; Spermatophyta; Magnoliophyta; eudicotyledons; core eudicotyledons; rosids; eurosids II; Brassicales; Brassicaceae; Arabidopsis. Here's the relevant bit in the latest release notes: "The second part of each sequence entry record contains the information appropriate to its keyword, in positions 13 to 80 for keywords and positions 11 to 80 for the sequence." 
The bioperl devs try to make our parsers as flexible as possible but others may not, so it's something in ApE that should probably be fixed. And as mentioned to you several times in the past on the mail list and on bugzilla, don't expect sequence records which sway from the standard (in this case, the release notes) to parse correctly in all cases. We can try supporting some that sway from that standard but only up to a point. If it causes additional bugs, headaches, or degrades performance it won't be supported. > ... > Well, I just copy&pasted the script from the bioperl webpages, I think > from a tutorial or FAQ, don't remember anymore. Well, can't help you if you can't point out where the code originated from. We would like to know so it can be corrected. > ... > Well, my search for such tools available on Unix to be used in a > script, > non-interactively, completely failed. My last hope except getting > improved > ApE is to use the GenomeDiagram under biopython, but so far my .gb > files > cannot be parsed yet. :( > Martin As mentioned previously you will likely have to code for it yourself (perl or python) or help debug the relevant biopython code to get it working. We can't/won't do this for you unless/until it's something we feel warrants implementation. Judging by the bug list, we also haven't the time nor inclination to code for it. Sorry but we have other priorities besides doing your work for you. chris From jesper at krogh.cc Tue Jun 26 07:05:32 2007 From: jesper at krogh.cc (Jesper Krogh) Date: Tue, 26 Jun 2007 09:05:32 +0200 (CEST) Subject: [Bioperl-l] Possible bug in Bio::SeqIO::embl.pm Message-ID: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk> Hi List. Trying to parse the embl database, the embl-parser fails on: AB019196 http://www.ebi.ac.uk/cgi-bin/expasyfetch?AB019196 ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: AB019196 seems to have an invalid species classification. 
STACK: Error::throw STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/Root.pm:359 STACK: Bio::SeqIO::embl::_read_EMBL_Species /usr/share/perl/5.8/Bio/SeqIO/embl.pm:1091 STACK: Bio::SeqIO::embl::next_seq /usr/share/perl/5.8/Bio/SeqIO/embl.pm:322 STACK: -e:1 ----------------------------------------------------------- It seems to be dissatisfied with this: OS Acetobacter aceti OC Bacteria; Proteobacteria; Alphaproteobacteria; Rhodospirillales; OC Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter. Thanks. -- Jesper Krogh From cjfields at uiuc.edu Tue Jun 26 13:13:50 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 26 Jun 2007 08:13:50 -0500 Subject: [Bioperl-l] Possible bug in Bio::SeqIO::embl.pm In-Reply-To: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk> References: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk> Message-ID: <246C49A7-D91E-4E74-921D-4B9C174AB41F@uiuc.edu> I can verify this using bioperl-live. Can you file this as a bug? http://bugzilla.open-bio.org/ chris On Jun 26, 2007, at 2:05 AM, Jesper Krogh wrote: > Hi List. > > Trying to parse the embl database, the embl-parser fails on: AB019196 > http://www.ebi.ac.uk/cgi-bin/expasyfetch?AB019196 > > > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: AB019196 seems to have an invalid species classification. > STACK: Error::throw > STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/Root.pm:359 > STACK: Bio::SeqIO::embl::_read_EMBL_Species > /usr/share/perl/5.8/Bio/SeqIO/embl.pm:1091 > STACK: Bio::SeqIO::embl::next_seq > /usr/share/perl/5.8/Bio/SeqIO/embl.pm:322 > STACK: -e:1 > ----------------------------------------------------------- > > > It seems to be dissatisfied with this: > OS Acetobacter aceti > OC Bacteria; Proteobacteria; Alphaproteobacteria; Rhodospirillales; > OC Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter. > > Thanks. 
> --
> Jesper Krogh

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From suji_ramin at yahoo.com Tue Jun 26 04:58:36 2007
From: suji_ramin at yahoo.com (SujiBala)
Date: Mon, 25 Jun 2007 21:58:36 -0700 (PDT)
Subject: [Bioperl-l] Error in constructing Phylogenetic tree using BioPerl
Message-ID: <571051.26423.qm@web51107.mail.re2.yahoo.com>

Hi, hello,

This is Sujatha from Singapore. I am trying to construct a phylogenetic
tree using DNAStatistics and the Kimura method, but I am getting the
following error message. It would be nice if you could help me resolve
this problem ASAP.

Error message:
Must supply a valid Bio::Align::AlignI for the _align parameter in the distance

My program:
use Bio::AlignIO;
use Bio::Align::DNAStatistics;
use Bio::Tree::DistanceFactory;
# for a dna alignment can also use ProteinStatistics
@aln = Bio::AlignIO->new(-file => 'out4.fa', -format=>'clustalw');
$stats = Bio::Align::DNAStatistics->new;
$mat = $stats->distance( -align => @aln,-method => 'Kimura');
$dfactory = Bio::Tree::DistanceFactory->new(-method => 'NJ');
$tree = $dfactory->make_tree($mat);

I am using a clustalw-formatted fasta file with more than one sequence.

SujiBala

From bartels.stefan at mh-hannover.de Tue Jun 26 09:26:03 2007
From: bartels.stefan at mh-hannover.de (don esteban)
Date: Tue, 26 Jun 2007 02:26:03 -0700 (PDT)
Subject: [Bioperl-l] Example code in Bioperl Tutorial
In-Reply-To:
References:
Message-ID: <11302459.post@talk.nabble.com>

Try setting the proxy configuration in your script:

$ENV{"HTTP_PROXY"}="http://proxy.somewhere.org:8080";

L Xu wrote:
>
> I do have an internet connection but do not use a proxy server.
> I tested the network connection with the ping command (below). The NCBI
> website does not respond. Is there any special network setting needed for
> connecting to the NCBI website?
> Thank you so much.
>
> C:\>ping www.yahoo.com
>
> Pinging www.yahoo-ht3.akadns.net [69.147.114.210] with 32 bytes of data:
>
> Reply from 69.147.114.210: bytes=32 time=363ms TTL=45
> Reply from 69.147.114.210: bytes=32 time=319ms TTL=45
> Reply from 69.147.114.210: bytes=32 time=312ms TTL=45
> Reply from 69.147.114.210: bytes=32 time=360ms TTL=45
>
> Ping statistics for 69.147.114.210:
> Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
> Approximate round trip times in milli-seconds:
> Minimum = 312ms, Maximum = 363ms, Average = 338ms
>
> C:\>ping www.ncbi.nlm.nih.gov
>
> Pinging www.ncbi.nlm.nih.gov [130.14.29.110] with 32 bytes of data:
>
> Request timed out.
> Request timed out.
> Request timed out.
> Request timed out.
>
> Ping statistics for 130.14.29.110:
> Packets: Sent = 4, Received = 0, Lost = 4 (100% loss),
>
> = = = Original message = = =
>
> Judging by the output it looks like you have no network access or can't
> connect to the server (what remoteblast needs). Make sure you don't need
> proxy settings.
>
> To preempt the next question, no, I'm not going to explain what a proxy
> is. The RemoteBlast docs show how to set them, and Google is a wonderful
> tool...
>
> chris
>
> On Jun 13, 2007, at 7:16 AM, L Xu wrote:
>
>
> ...
> -------------------- WARNING ---------------------
> MSG:
> An Error Occurred
>

> 500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
>
> ---------------------------------------------------
> ...

From rahall2 at ualr.edu Tue Jun 26 13:51:08 2007
From: rahall2 at ualr.edu (Roger Hall)
Date: Tue, 26 Jun 2007 08:51:08 -0500
Subject: [Bioperl-l] Tuesday: ill
Message-ID: <000001c7b7f9$0d029040$4601a8c0@LIBERAL2>

Well I guess I won't be in today after all.

Michael, Stephen, and Ames: please call me from the grad office at 10
on my cell phone (744-8514).

Phil: please go ahead and meet with Tim, and let me know what questions
remain afterwards.

Thanks!

Roger Hall
Technical Director
MidSouth Bioinformatics Center
University of Arkansas at Little Rock
(501) 569-8074

From cjfields at uiuc.edu Tue Jun 26 14:02:29 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 26 Jun 2007 09:02:29 -0500
Subject: [Bioperl-l] Possible bug in Bio::SeqIO::embl.pm
In-Reply-To: <4681185D.5030402@cam.ac.uk>
References: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk>
	<246C49A7-D91E-4E74-921D-4B9C174AB41F@uiuc.edu>
	<4681185D.5030402@cam.ac.uk>
Message-ID:

I'll try getting to that ASAP (as well as a few bugs).
The problem is we have to patch this in 2-3 places (SeqIO::swiss, SeqIO::embl) due to repeated code issues, something I'm trying to rectify with a new set of parsers. Just haven't had the time to work on them lately unfortunately. chris On Jun 26, 2007, at 8:45 AM, Roy Chaudhuri wrote: > Sorry, replied to this but forgot to cc the list. > > It looks like a related problem to bug 2288 that I filed about > Bio::SeqIO::swiss - the period after subgen. is what causes the > problems since it is interpreted as a seperator between nodes. I > put a patch in for Bio::SeqIO::swiss that works for me, but I guess > it might have side effects. > > Roy. > -- > Dr. Roy Chaudhuri > Department of Veterinary Medicine > University of Cambridge, U.K. > > Chris Fields wrote: >> I can verify this using bioperl-live. Can you file this as a bug? >> http://bugzilla.open-bio.org/ >> chris >> On Jun 26, 2007, at 2:05 AM, Jesper Krogh wrote: >>> Hi List. >>> >>> Trying to parse the embl database, the embl-parser fails on: >>> AB019196 >>> http://www.ebi.ac.uk/cgi-bin/expasyfetch?AB019196 >>> >>> >>> ------------- EXCEPTION: Bio::Root::Exception ------------- >>> MSG: AB019196 seems to have an invalid species classification. >>> STACK: Error::throw >>> STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/ >>> Root.pm:359 >>> STACK: Bio::SeqIO::embl::_read_EMBL_Species >>> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:1091 >>> STACK: Bio::SeqIO::embl::next_seq >>> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:322 >>> STACK: -e:1 >>> ----------------------------------------------------------- >>> >>> >>> It seems to be dissatisfied with this: >>> OS Acetobacter aceti >>> OC Bacteria; Proteobacteria; Alphaproteobacteria; >>> Rhodospirillales; >>> OC Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter. >>> >>> Thanks. 
>>> -- >>> Jesper Krogh >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> Christopher Fields >> Postdoctoral Researcher >> Lab of Dr. Robert Switzer >> Dept of Biochemistry >> University of Illinois Urbana-Champaign >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From rrc22 at cam.ac.uk Tue Jun 26 13:45:01 2007 From: rrc22 at cam.ac.uk (Roy Chaudhuri) Date: Tue, 26 Jun 2007 14:45:01 +0100 Subject: [Bioperl-l] Possible bug in Bio::SeqIO::embl.pm In-Reply-To: <246C49A7-D91E-4E74-921D-4B9C174AB41F@uiuc.edu> References: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk> <246C49A7-D91E-4E74-921D-4B9C174AB41F@uiuc.edu> Message-ID: <4681185D.5030402@cam.ac.uk> Sorry, replied to this but forgot to cc the list. It looks like a related problem to bug 2288 that I filed about Bio::SeqIO::swiss - the period after subgen. is what causes the problems since it is interpreted as a seperator between nodes. I put a patch in for Bio::SeqIO::swiss that works for me, but I guess it might have side effects. Roy. -- Dr. Roy Chaudhuri Department of Veterinary Medicine University of Cambridge, U.K. Chris Fields wrote: > I can verify this using bioperl-live. Can you file this as a bug? > > http://bugzilla.open-bio.org/ > > chris > > On Jun 26, 2007, at 2:05 AM, Jesper Krogh wrote: > >> Hi List. >> >> Trying to parse the embl database, the embl-parser fails on: AB019196 >> http://www.ebi.ac.uk/cgi-bin/expasyfetch?AB019196 >> >> >> ------------- EXCEPTION: Bio::Root::Exception ------------- >> MSG: AB019196 seems to have an invalid species classification. 
>> STACK: Error::throw >> STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/Root.pm:359 >> STACK: Bio::SeqIO::embl::_read_EMBL_Species >> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:1091 >> STACK: Bio::SeqIO::embl::next_seq >> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:322 >> STACK: -e:1 >> ----------------------------------------------------------- >> >> >> It seems to be dissatisfied with this: >> OS Acetobacter aceti >> OC Bacteria; Proteobacteria; Alphaproteobacteria; Rhodospirillales; >> OC Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter. >> >> Thanks. >> -- >> Jesper Krogh >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From bix at sendu.me.uk Tue Jun 26 14:13:48 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 26 Jun 2007 15:13:48 +0100 Subject: [Bioperl-l] Error in constructing Phylogenetic tree using BioPerl In-Reply-To: <571051.26423.qm@web51107.mail.re2.yahoo.com> References: <571051.26423.qm@web51107.mail.re2.yahoo.com> Message-ID: <46811F1C.3020307@sendu.me.uk> SujiBala wrote: > Hi Hello > This is sujatha from singapore. I am trying to construct phylo tree using DNAStatistics and Kirma method. But I am getting the following error message. It would be nice if you could help me resolve this problem asap. 
>
> Error message
> Must supply a valid Bio::Align::AlignI for the _align parameter in the distance
> My program
> use Bio::AlignIO;
> use Bio::Align::DNAStatistics;
> use Bio::Tree::DistanceFactory;
> # for a dna alignment can also use ProteinStatistics
> @aln = Bio::AlignIO->new(-file => 'out4.fa', -format=>'clustalw');
> $stats = Bio::Align::DNAStatistics->new;
> $mat = $stats->distance( -align => @aln,-method => 'Kimura');

Without looking at the docs for these modules, it is immediately obvious
that Bio::AlignIO->new() is going to return an instance of Bio::AlignIO
and not an array of alignments. It is also obvious that the -align =>
parameter for the distance() method can't take an array of anything
(but probably an array ref?).

Check the documentation and make sure you know what objects you're
generating and passing around.

From schlesi at ebi.ac.uk Tue Jun 26 14:59:13 2007
From: schlesi at ebi.ac.uk (Felix Schlesinger)
Date: Tue, 26 Jun 2007 15:59:13 +0100
Subject: [Bioperl-l] PAML parser
Message-ID: <7317d50c0706260759r3bdda445lf0acf10a31ee5765@mail.gmail.com>

Hello,

I am trying to use the PAML result parser (BioPerl
Bio::Tools::Phylo::PAML) on output files generated by PAML 3.15.
However, on all outputs I have tested no result object is returned
(next_result is undef). This includes the HIV and Lysin datasets
included with PAML.

My code is:

my $codemlp = Bio::Tools::Phylo::PAML->new(-file => "mlc",dir => "/.");
my $result = $codemlp->next_result;
foreach my $model ( $result->get_NSSite_results ) {
...

and the error is:

Can't call method "get_NSSite_results" on an undefined value ...

I can include the mlc file if needed.

Is this supposed to work? Or do I have to run paml from bioperl to
parse the results?
Thanks Felix From Xianjun.Dong at bccs.uib.no Tue Jun 26 14:35:17 2007 From: Xianjun.Dong at bccs.uib.no (Xianjun Dong) Date: Tue, 26 Jun 2007 16:35:17 +0200 Subject: [Bioperl-l] bug for PAML::Baseml Message-ID: <46812425.8000509@ii.uib.no> An HTML attachment was scrubbed... URL: From Xianjun.Dong at bccs.uib.no Tue Jun 26 15:40:47 2007 From: Xianjun.Dong at bccs.uib.no (Xianjun Dong) Date: Tue, 26 Jun 2007 17:40:47 +0200 Subject: [Bioperl-l] bug for PAML::Baseml In-Reply-To: <46812425.8000509@ii.uib.no> References: <46812425.8000509@ii.uib.no> Message-ID: <4681337F.1000902@ii.uib.no> An HTML attachment was scrubbed... URL: From hartzell at alerce.com Tue Jun 26 18:12:04 2007 From: hartzell at alerce.com (George Hartzell) Date: Tue, 26 Jun 2007 14:12:04 -0400 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> Message-ID: <18049.22260.967524.353173@almost.alerce.com> There don't seem to be any .cvsignore files in the repository, or in CVSROOT/cvsignore. Am I missing something, or don't we use them? g. From cjfields at uiuc.edu Tue Jun 26 19:54:25 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 26 Jun 2007 14:54:25 -0500 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <18049.22260.967524.353173@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.22260.967524.353173@almost.alerce.com> Message-ID: <74515C87-5553-4AF0-9B83-26F3E71E15C8@uiuc.edu> Not sure. 
You may want to email support at open-bio.org; my guess is Chris D or Jason would have an answer. chris On Jun 26, 2007, at 1:12 PM, George Hartzell wrote: > > There don't seem to be any .cvsignore files in the repository, or in > CVSROOT/cvsignore. > > Am I missing something, or don't we use them? > > g. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Tue Jun 26 19:55:21 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 26 Jun 2007 16:55:21 -0300 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <18049.22260.967524.353173@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.22260.967524.353173@almost.alerce.com> Message-ID: Maybe we've been using the default? On Jun 26, 2007, at 3:12 PM, George Hartzell wrote: > > There don't seem to be any .cvsignore files in the repository, or in > CVSROOT/cvsignore. > > Am I missing something, or don't we use them? > > g. 
> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hartzell at alerce.com Tue Jun 26 20:21:30 2007 From: hartzell at alerce.com (George Hartzell) Date: Tue, 26 Jun 2007 16:21:30 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> Message-ID: <18049.30026.61328.134490@almost.alerce.com> Chris Fields writes: > [...] > It looks like George Hartzell may be taking a crack at it, with > Rutger Vos, Nathan Haigh, and moi helping out where needed. If so we > could have something testable relatively soon. After that we'll need > to work out a few other issues, basically what's on Hilmar's list. There's a repository on file:///home/hartzell/bioperl with all of the components projects in place. If you have a dev.open-bio.org account and you're in the bioperl group, you're good to get at it via: file:///home/hartzell/bioperl or svn+ssh://dev.open-bio.org/home/hartzell/bioperl There are a couple of things to think about: - how are we going to provide access. I *think* that I heard a decision to use http:// and https://. Who gets to set that up? - what do we want to do about keywords. The cvs2svn tool guesses and automatically sets the svn:keywords property to Author Date Revision and Id on many of the files in the tree. If it looks like it got it right, we can stick with it. 
Or, we can disable that conversion and I've cribbed a little script that'll grep out files using Id and set the svn:keywords property accordingly. - what do we want to do about svn:ignore? I haven't seen any .cvsignore files. Beyond that, how does the repo look? How are we going to cut over? Are we going to try to push svn commits to the read-mostly CVS repo, or just keep it around for history's sake (I lean towards the latter). g. From jason at bioperl.org Tue Jun 26 23:22:20 2007 From: jason at bioperl.org (Jason Stajich) Date: Tue, 26 Jun 2007 20:22:20 -0300 Subject: [Bioperl-l] PAML parser In-Reply-To: <7317d50c0706260759r3bdda445lf0acf10a31ee5765@mail.gmail.com> References: <7317d50c0706260759r3bdda445lf0acf10a31ee5765@mail.gmail.com> Message-ID: Can you make sure you have the latest and greatest version of these modules from the CVS repository? We had to fix things to parse 3.15 -- I can't tell if this is the problem or something else. You can also add -verbose => 1when you initialize the object and it may spit out more warnings about whether it is having problems. -jason On Jun 26, 2007, at 11:59 AM, Felix Schlesinger wrote: > Hello, > > I am trying to use the PAML result parser (BioPerl > Bio::Tools::Phylo::PAML) on output files generated by PAML 3.15. > However on all outputs I have tested no result object is returned > (next_result is undef). This includes the HIV and Lysin datasets > included with PAML. > My code is: > > my $codemlp = Bio::Tools::Phylo::PAML->new(-file => "mlc",dir => > "/."); > my $result = $codemlp->next_result; > foreach my $model ( $result->get_NSSite_results ) { > ... > > and the error is: Can't call method "get_NSSite_results" on an > undefined value ... > > I can include the mlc file is needed. Is this supposed to work? Or do > I have to run paml from bioperl to parse the results? 
> > Thanks > Felix > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From jason at bioperl.org Tue Jun 26 23:27:05 2007 From: jason at bioperl.org (Jason Stajich) Date: Tue, 26 Jun 2007 20:27:05 -0300 Subject: [Bioperl-l] Error in constructing Phylogenetic tree using BioPerl In-Reply-To: <46811F1C.3020307@sendu.me.uk> References: <571051.26423.qm@web51107.mail.re2.yahoo.com> <46811F1C.3020307@sendu.me.uk> Message-ID: On Jun 26, 2007, at 11:13 AM, Sendu Bala wrote: > SujiBala wrote: >> Hi Hello >> This is sujatha from singapore. I am trying to construct phylo >> tree using DNAStatistics and Kirma method. But I am getting the >> following error message. It would be nice if you could help me >> resolve this problem asap. >> >> Error messasge >> Must supply a valid Bio::Align::AlignI for the _align >> parameter in the distance >> My program >> use Bio::AlignIO; >> use Bio::Align::DNAStatistics; >> use Bio::Tree::DistanceFactory; >> # for a dna alignment can also use ProteinStatistics >> @aln = Bio::AlignIO->new(-file => 'out4.fa', -format=>'clustalw'); >> $stats = Bio::Align::DNAStatistics->new; >> $mat = $stats->distance( -align => @aln,-method => 'Kimura'); > yep you want to call next_aln on the Bio::AlignIO object. I fixed the example code in the HOWTO so it should work properly now; http://bioperl.org/wiki/HOWTO:Trees#Constructing_Trees > Without looking at the docs for these modules, it is immediately > obvious > that Bio::AlignIO->new() is going to return an instance of > Bio::AlignIO > and not an array of alignments. It is also obvious that the -align => > parameter for the distance() method can't take an array of anything > (but > probably an array ref?). > > Check the documentation and make sure you know what objects you're > generating and passing around. 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From jason at bioperl.org Tue Jun 26 23:29:11 2007 From: jason at bioperl.org (Jason Stajich) Date: Tue, 26 Jun 2007 20:29:11 -0300 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.22260.967524.353173@almost.alerce.com> Message-ID: <5A8FD8A3-9593-4925-AA74-D4B03CDC1C34@bioperl.org> We don't have one. I have one on my local machine that defined basically *~ and .#* so I never had a problem. Feel free to propose one if you think it is important, I never really though it was important. On Jun 26, 2007, at 4:55 PM, Hilmar Lapp wrote: > Maybe we've been using the default? > > On Jun 26, 2007, at 3:12 PM, George Hartzell wrote: > >> >> There don't seem to be any .cvsignore files in the repository, or in >> CVSROOT/cvsignore. >> >> Am I missing something, or don't we use them? >> >> g. 
>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From j_martin at lbl.gov Wed Jun 27 01:01:29 2007 From: j_martin at lbl.gov (Joel Martin) Date: Tue, 26 Jun 2007 18:01:29 -0700 Subject: [Bioperl-l] Example code in Bioperl Tutorial In-Reply-To: <11302459.post@talk.nabble.com> References: <11302459.post@talk.nabble.com> Message-ID: <20070627010129.GA8628@eniac.jgi-psf.org> Hello, The tutorial code snippet is an endless loop, I think it's supposed to remove the rid. As the only print statement you added is after the endless loop, you aren't seeing anything happen. Use the code from this instead, perldoc Bio::Tools::Run::RemoteBlast The bptutorial.pl does have a note that it's not useful and to read the pod for Bio::Tools::Run::RemoteBlast, it's in the next sentences after the code snippet you used. Though, as it's a tutorial example it might be nice to remove the while loop .. or at least add the sleep(5) part. http://www.bioperl.org/wiki/Bptutorial.pl#Running_BLAST_.28using_RemoteBlast.pm.29 Aside from that, you may have network issues but www.ncbi.nlm.nih.gov doesn't respond to ping as far as I can tell. Joel On Tue, Jun 26, 2007 at 02:26:03AM -0700, don esteban wrote: > > Try using the Proxyconfiguration in your script: > > $ENV{"HTTP_PROXY"}="http://proxy.somewhere.org:8080"; > > > > > L Xu wrote: > > > > I do have the internet connection bu not use the proxy server. 
> > I tested the network connection with ping command (below). The ncbi > > website > > does not response. Is there any special network setting needed for > > connecting the ncbi website? > > Thank you so much. > > > > C:\>ping www.yahoo.com > > > > Pinging www.yahoo-ht3.akadns.net [69.147.114.210] with 32 bytes of data: > > > > Reply from 69.147.114.210: bytes=32 time=363ms TTL=45 > > Reply from 69.147.114.210: bytes=32 time=319ms TTL=45 > > Reply from 69.147.114.210: bytes=32 time=312ms TTL=45 > > Reply from 69.147.114.210: bytes=32 time=360ms TTL=45 > > > > Ping statistics for 69.147.114.210: > > Packets: Sent = 4, Received = 4, Lost = 0 (0% loss), > > Approximate round trip times in milli-seconds: > > Minimum = 312ms, Maximum = 363ms, Average = 338ms > > > > C:\>ping www.ncbi.nlm.nih.gov > > > > Pinging www.ncbi.nlm.nih.gov [130.14.29.110] with 32 bytes of data: > > > > Request timed out. > > Request timed out. > > Request timed out. > > Request timed out. > > > > Ping statistics for 130.14.29.110: > > Packets: Sent = 4, Received = 0, Lost = 4 (100% loss), > > > > > > > > = = = Original message = = = > > > > Judging by the output it looks like you have no network access or? can't > > connect to the server (what remoteblast needs).? Make sure you? don't need > > proxy settings. > > > > To preempt the next question, no, I'm not going to explain what a? proxy > > is.? The RemoteBlast docs show how to set them, and Google is a? wonderful > > tool... > > > > chris > > > > On Jun 13, 2007, at 7:16 AM, L Xu wrote: > > > > > > ... > > -------------------- WARNING --------------------- > > MSG: > > An Error Occurred > > > >

> > 500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error) > > > > > > > > --------------------------------------------------- > > ... > > > > ___________________________________________________________ > > Sent by ePrompter, the premier email notification software. > > Free download at http://www.ePrompter.com. > > > > _________________________________________________________________ > > Get a preview of Live Earth, the hottest event this summer - only on MSN > > http://liveearth.msn.com?source=msntaglineliveearthhm > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > > > -- > View this message in context: http://www.nabble.com/Example-code-in-Bioperl-Tutorial-tf3914295.html#a11302459 > Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From melvinp at pacific.net.sg Wed Jun 27 05:25:08 2007 From: melvinp at pacific.net.sg (Melvin P) Date: Wed, 27 Jun 2007 13:25:08 +0800 Subject: [Bioperl-l] finding statistics on AA Message-ID: <4681F4B4.8010609@pacific.net.sg> Hi, I am new to BioPerl. I am trying to find out if there is any class that I can use for occupancy number/occurrence counts, psuedo count, observed frequency etc given a few sequences of amino acid. For example, what is the observed frequency of residue i at position p. My objective is to analyze the information content. Thanks. 
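As a rough illustration of the quantities asked about above (occurrence counts, pseudocounts, the observed frequency of residue i at position p, and information content), here is a minimal plain-Perl sketch. It does not use or assume any particular BioPerl class. The toy sequences, the add-one pseudocount, and the helper names frequency() and information() are all assumptions made for the example, and the input is assumed to be ungapped and already aligned.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toy aligned amino-acid sequences (hypothetical data, all the same length).
my @seqs = qw(MKVLA MKILA MRVLG MKVLA);

my @alphabet    = split //, 'ACDEFGHIKLMNPQRSTVWY';   # 20 standard residues
my $pseudocount = 1;                                   # assumed add-one pseudocount

# Occurrence counts: $counts[$p]{$residue} is how often $residue
# is seen in column $p (0-based) across all sequences.
my @counts;
for my $seq (@seqs) {
    my @res = split //, uc $seq;
    $counts[$_]{ $res[$_] }++ for 0 .. $#res;
}

# Observed frequency of residue $i at position $p, smoothed with pseudocounts.
sub frequency {
    my ($p, $i) = @_;
    my $n = $counts[$p]{$i} || 0;
    return ($n + $pseudocount) / (@seqs + $pseudocount * @alphabet);
}

# Information content of column $p in bits: log2(20) minus the Shannon entropy.
sub information {
    my ($p) = @_;
    my $entropy = 0;
    for my $i (@alphabet) {
        my $f = frequency($p, $i);
        $entropy -= $f * log($f) / log(2);
    }
    return log(scalar @alphabet) / log(2) - $entropy;
}

# Report the count and frequency of lysine (K) plus the column information.
for my $p (0 .. length($seqs[0]) - 1) {
    printf "pos %d  count(K)=%d  f(K)=%.3f  info=%.3f bits\n",
        $p + 1, $counts[$p]{K} || 0, frequency($p, 'K'), information($p);
}
```

With real data the sequences would come from an alignment rather than a hard-coded list, and gap characters would need their own handling, since this sketch only normalises over the 20 standard residues.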
From bix at sendu.me.uk Wed Jun 27 10:23:58 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 27 Jun 2007 11:23:58 +0100 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <467FBDD3.8050009@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> Message-ID: <46823ABE.2080300@sendu.me.uk> Sendu Bala wrote: > Sendu Bala wrote: >> In considering updating all the test scripts to [... use] >> t/lib/BioperlTest.pm > > I'm now in the process of converting all test scripts. And I've now completed that job (for bioperl-live at least), except for t/EUtilities.t since I know Chris is working on it. In addition to converting to Test::More where necessary, I've also made all psuedo-TODO blocks real ones. Previously I had advised to use SKIP blocks instead since TODO blocks need a Test::Harness upgrade. However I think in the next release we ought to make such upgrading compulsory (which should be automatic when combined with compulsory usage of Module::Build and Test::More in turn: users simply have to update CPAN). The conversion to BioperlTest directly led to the discovery and fixing of 6 minor bugs, so was certainly not without merit. No user or developer needs to have BIOPERLDEBUG permanently set to true anymore. To run all tests you just have to answer yes to the BioDBGFF and networking questions of 'perl Build.PL'. With './Build test' you then get clean, easy-to-read output where it is obvious to see that we currently have these issues: t/Sopma.t and t/BioGraphics.t still have fails that I mentioned in another thread. t/protgraph.t, t/blast_pull.t, t/SearchIO.t, t/RestrictionIO.t, t/RNA_SearchIO.t, t/PopGen.t, t/Genewise.t, t/Assembly.t and t/Annotation.t all have TODO tests. If you know about those modules, now would be a great time to implement those TODOs! Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are deprecated' warnings. 
To debug a particular test you could say: BIOPERLDEBUG=1 ./Build test --verbose --test_files t/Sopma.t I've updated the HOWTO for writing test scripts: http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests From cjfields at uiuc.edu Wed Jun 27 11:55:47 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 06:55:47 -0500 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <46823ABE.2080300@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> Message-ID: On Jun 27, 2007, at 5:23 AM, Sendu Bala wrote: > Sendu Bala wrote: >> Sendu Bala wrote: >>> In considering updating all the test scripts to [... use] >>> t/lib/BioperlTest.pm >> >> I'm now in the process of converting all test scripts. > > And I've now completed that job (for bioperl-live at least), except > for > t/EUtilities.t since I know Chris is working on it. The network tests will be much shorter; the bulk will be transferred to a new suite for the backend Bio::Tools:EUtilities parser (which will test static files in t/data/eutils, so no dynamic changes). > In addition to converting to Test::More where necessary, I've also > made > all psuedo-TODO blocks real ones. Previously I had advised to use SKIP > blocks instead since TODO blocks need a Test::Harness upgrade. > However I > think in the next release we ought to make such upgrading compulsory > (which should be automatic when combined with compulsory usage of > Module::Build and Test::More in turn: users simply have to update > CPAN). Sounds good to me, but there may be some grumblings out there. Having specific TODOs are nice b/c we can test them w/o fails. Handy. > The conversion to BioperlTest directly led to the discovery and fixing > of 6 minor bugs, so was certainly not without merit. > > > No user or developer needs to have BIOPERLDEBUG permanently set to > true > anymore. 
To run all tests you just have to answer yes to the BioDBGFF > and networking questions of 'perl Build.PL'. With './Build test' you > then get clean, easy-to-read output where it is obvious to see that we > currently have these issues: > > t/Sopma.t and t/BioGraphics.t still have fails that I mentioned in > another thread. > > t/protgraph.t, t/blast_pull.t, t/SearchIO.t, t/RestrictionIO.t, > t/RNA_SearchIO.t, t/PopGen.t, t/Genewise.t, t/Assembly.t and > t/Annotation.t all have TODO tests. If you know about those > modules, now > would be a great time to implement those TODOs! The RNA_SearchIO.t is from ERPIN output; there's no easy way to generate it beyond having the user supply the info (or having the program author change the output). Will have to look at the others to see what's involved; maybe something for the priority list? > Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are > deprecated' warnings. I ran into this with XML::Simple data structures recently; there was an easy way around it via XML::Simple using forcearray(). It has to do with attempting to assign data to/from a hash in a specific way involving array references (though I can't remember exactly how; I slept since then). > To debug a particular test you could say: > BIOPERLDEBUG=1 ./Build test --verbose --test_files t/Sopma.t > > > I've updated the HOWTO for writing test scripts: > http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests Good work! chris From schlesi at ebi.ac.uk Wed Jun 27 11:57:27 2007 From: schlesi at ebi.ac.uk (Felix Schlesinger) Date: Wed, 27 Jun 2007 12:57:27 +0100 Subject: [Bioperl-l] Selecting columns from alignment Message-ID: <7317d50c0706270457i1c3d92a8hb124fa663f51b837@mail.gmail.com> Hi, is there an elegant way to select columns from an alignment object fulfilling a certain property (for example less than x gaps)? Everything I can see from Align::AlignI seems to involve looking at the individual sequences, creating lots of slices and appending them. 
If there a better way in bioperl or failing that, does anyone know a software package with similar functionality (t-coffee has lots of filters for alignments, but nothing to select columns besides by position it seems). Ideally this would also return a mapping from old to new positions in one of the sequences of course. Thanks Felix From cjfields at uiuc.edu Wed Jun 27 14:36:41 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 09:36:41 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <18049.30026.61328.134490@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> Message-ID: <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> On Jun 26, 2007, at 3:21 PM, George Hartzell wrote: > ... > If you have a dev.open-bio.org account and you're in the bioperl > group, you're good to get at it via: > > file:///home/hartzell/bioperl > > or > > svn+ssh://dev.open-bio.org/home/hartzell/bioperl I managed to get it working using file://. Haven't tried svn+ssh yet but I've had persistent problems getting ssh to work properly on my macbook; not sure why yet but I haven't had time to play around with it. > There are a couple of things to think about: > > - how are we going to provide access. I *think* that I heard a > decision to use http:// and https://. Who gets to set that up? That hasn't been decided yet and will be up to a consensus of the core devs, but I think the odds are in favor of allowing https:// but against allowing http://. As for setup that could be anyone with admin privs, though it may be best left up to Chris D, Jason, or Mauricio. > - what do we want to do about keywords. 
The cvs2svn tool guesses > and automatically sets the svn:keywords property to Author Date > Revision and Id on many of the files in the tree. If it looks > like it got it right, we can stick with it. Or, we can disable > that conversion and I've cribbed a little script that'll grep out > files using Id and set the svn:keywords property accordingly. Probably again a consensus issue, but you can choose one route. My inclination is the former if it's easier. > - what do we want to do about svn:ignore? I haven't seen any > .cvsignore files. Not sure. I've never used one personally, but (as Jason suggests) if you have ideas for one you can propose them, or we can suggest devs set up svn::ignore locally. > Beyond that, how does the repo look? Seems fine, though a simple 'svn file:///home/hartzell/bioperl' checkout gets everything (all distros, branches, etc). We need to make sure everyone uses 'svn co file:///home/hartzell/bioperl/bioperl- live/trunk /live' or similar if they just want the latest core/db/etc. We'll also need to start a svn wiki page to show how to get relevant distros (similar in style probably to the cvs page, with dev information, how to set up ssh keys, https stuff, etc). > How are we going to cut over? > > Are we going to try to push svn commits to the read-mostly CVS repo, > or just keep it around for history's sake (I lean towards the latter). I think a clean cut-over. Everyone would be warned to hold commits for a day (lest they be lost), then probably do something in this order: - switch cvs to read-only except for svn commits - run a clean cvs2svn - set up svn as read/write - set up test commits to cvs via svn - disable cvs commit messages to bioperl-guts, enable svn commit messages in it's place. - push svn commits over to read-only cvs cvs >>must<< be read-only after that point (no cvs->svn commits), with write access only available through svn. 
If at some future point there is no reason to keep it around or that it is more trouble than it's worth, we can make a decision then on cvs's fate. > g. chris From rvos at interchange.ubc.ca Wed Jun 27 14:23:25 2007 From: rvos at interchange.ubc.ca (rvos) Date: Wed, 27 Jun 2007 07:23:25 -0700 (PDT) Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] Message-ID: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> > Are we going to try to push svn commits to the read-mostly CVS repo, > or just keep it around for history's sake (I lean towards the latter). I'm a little confused - surely once the svn is up and running we'll want *no more* cvs commits? Parallel repositories that each accumulate stuff will be a nightmare. I'm probably just not getting your point. Rutger From cjfields at uiuc.edu Wed Jun 27 15:18:03 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 10:18:03 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> Message-ID: <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> On Jun 27, 2007, at 9:23 AM, rvos wrote: > >> Are we going to try to push svn commits to the read-mostly CVS repo, >> or just keep it around for history's sake (I lean towards the >> latter). > > I'm a little confused - surely once the svn is up and running we'll > want *no more* cvs commits? Parallel repositories that each > accumulate stuff will be a nightmare. I'm probably just not getting > your point. > > Rutger Most projects make a clean break with cvs (no more commits) for the reasons you point out. Not sure how the other core devs feel about that but I could go for that; it would def. prevent headaches. We could keep cvs for the time being as read-only, with no svn->cvs syncing. 
There are a few projects which have (as a phase-out plan) old read-only cvs repositories available, with an automatic svn->cvs commit following every new svn commit. Not sure how that works, esp. for branching/merging and so on which I could see potentially getting hairy. chris From cjfields at uiuc.edu Wed Jun 27 16:05:49 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 11:05:49 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <18049.30026.61328.134490@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> Message-ID: <5EA56270-3427-4995-B3C1-2789229AACF1@uiuc.edu> On Jun 26, 2007, at 3:21 PM, George Hartzell wrote: > ...If you have a dev.open-bio.org account and you're in the bioperl > group, you're good to get at it via: > > file:///home/hartzell/bioperl > > or > > svn+ssh://dev.open-bio.org/home/hartzell/bioperl Did manage to get svn+ssh working (with some password harassment); core tests passed enough that I think everything's okay. If ssh keys are set up correctly (mine aren't) it should work fine. 
chris From dmessina at wustl.edu Wed Jun 27 16:27:32 2007 From: dmessina at wustl.edu (David Messina) Date: Wed, 27 Jun 2007 11:27:32 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> Message-ID: > [Chris] > > I managed to get it working using file://. Haven't tried svn+ssh yet > but I've had persistent problems getting ssh to work properly on my > macbook; not sure why yet but I haven't had time to play around > with it. I just did a checkout and a test commit, both via svn+ssh -- works great for me. >> [George] >> >> - what do we want to do about keywords. The cvs2svn tool guesses >> and automatically sets the svn:keywords property to Author Date >> Revision and Id on many of the files in the tree. If it looks >> like it got it right, we can stick with it. Or, we can disable >> that conversion and I've cribbed a little script that'll grep out >> files using Id and set the svn:keywords property accordingly. I would think we would want "Author Date Id Rev URL" set on everything, no?. So either cvs2svn or your tool (whichever you think is better), followed by svn propset svn:keywords "Author Date Id Rev URL" * from the root of a working copy would take care of all of the existing files in the repository, I think. George knows more about this than I do, but I think you can set up a global config file with enable-auto-props = yes * = svn:keywords="Author Date Id Rev URL" to ensure it gets set on any future additions to the repository. >> - what do we want to do about svn:ignore? I haven't seen any >> .cvsignore files. > > Not sure. 
I've never used one personally, but (as Jason suggests) if > you have ideas for one you can propose them, or we can suggest devs > set up svn::ignore locally. I use the default global-ignores global-ignores = *.o *.lo *.la #*# .*.rej *.rej .*~ *~ .#* .DS_Store (again, in my system-wide config file), but I'm not tied to that. I do think we should have one, though; individuals can easily override any settings in the system-wide config with their own ~/.subversion/ config. >> Beyond that, how does the repo look? Looks great, George! Thanks for doing this. Dave From hartzell at alerce.com Wed Jun 27 17:00:53 2007 From: hartzell at alerce.com (George Hartzell) Date: Wed, 27 Jun 2007 13:00:53 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> Message-ID: <18050.38853.526224.791878@almost.alerce.com> rvos writes: > > > Are we going to try to push svn commits to the read-mostly CVS repo, > > or just keep it around for history's sake (I lean towards the latter). > > I'm a little confused - surely once the svn is up and running we'll > want *no more* cvs commits? Parallel repositories that each > accumulate stuff will be a nightmare. I'm probably just not getting > your point. There had been some point of keeping a CVS repository around as a read-only mirror of the svn repo, presumably for people whose habits or setup won't let them use svn. In theory, each commit to the svn repo can be automagically pushed down into CVS w/out user intervention; Google will tell you how, but I've never run anything that way. g. 
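On the keyword question raised earlier in this thread (finding the files that actually use the Id keyword and setting svn:keywords on them), a helper along those lines might look roughly like the sketch below. This is not George's script, which was never posted. The function name files_using_id, the choice to skip .svn and CVS directories, and the decision to print the svn propset commands rather than run them are all assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Return the text files under $root whose contents mention a CVS-style
# Id keyword (expanded or not), skipping .svn and CVS admin directories.
sub files_using_id {
    my ($root) = @_;
    my @hits;
    find(
        {
            no_chdir => 1,    # keep $_ as the full path
            wanted   => sub {
                if (-d $_ && $_ =~ m{(?:^|/)(?:\.svn|CVS)$}) {
                    $File::Find::prune = 1;    # do not descend into admin dirs
                    return;
                }
                return unless -f $_ && -T $_;  # plain text files only
                open my $fh, '<', $_ or return;
                while (my $line = <$fh>) {
                    if ($line =~ /\$Id(?::[^\$]*)?\$/) {
                        push @hits, $_;
                        last;
                    }
                }
                close $fh;
            },
        },
        $root
    );
    return sort @hits;
}

# Only print the commands, so the list can be reviewed before anything is run.
if (@ARGV) {
    print qq{svn propset svn:keywords "Id" "$_"\n} for files_using_id($ARGV[0]);
}
```

Run from the top of a working copy with the directory as its single argument, the output can be read over and then piped to a shell. Which keywords to set (just Id, or the longer list suggested above) is left open here, as it was in the thread.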
From dmessina at wustl.edu Wed Jun 27 17:27:01 2007 From: dmessina at wustl.edu (David Messina) Date: Wed, 27 Jun 2007 12:27:01 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> Message-ID: <99969FC2-479E-408C-AADB-7664EBE937CF@wustl.edu> > [Chris] > We'll also need to start a svn wiki page to show how to get relevant > distros (similar in style probably to the cvs page, with dev > information, how to set up ssh keys, https stuff, etc). I cloned the CVS page and have started adapting it for Subversion: http://www.bioperl.org/wiki/Using_Subversion I'll do some more on it later today, but if anyone wants to fiddle with it in the interim, please do. Dave From n.haigh at sheffield.ac.uk Wed Jun 27 18:44:16 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Wed, 27 Jun 2007 19:44:16 +0100 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <46823ABE.2080300@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> Message-ID: <4682B000.2050707@sheffield.ac.uk> Sendu Bala wrote: > Sendu Bala wrote: >> Sendu Bala wrote: >>> In considering updating all the test scripts to [... use] >>> t/lib/BioperlTest.pm >> I'm now in the process of converting all test scripts. > > And I've now completed that job (for bioperl-live at least), except for > t/EUtilities.t since I know Chris is working on it. > > > In addition to converting to Test::More where necessary, I've also made > all psuedo-TODO blocks real ones. 
Previously I had advised to use SKIP > blocks instead since TODO blocks need a Test::Harness upgrade. However I > think in the next release we ought to make such upgrading compulsory > (which should be automatic when combined with compulsory usage of > Module::Build and Test::More in turn: users simply have to update CPAN). > > > The conversion to BioperlTest directly led to the discovery and fixing > of 6 minor bugs, so was certainly not without merit. > > > No user or developer needs to have BIOPERLDEBUG permanently set to true > anymore. To run all tests you just have to answer yes to the BioDBGFF > and networking questions of 'perl Build.PL'. With './Build test' you > then get clean, easy-to-read output where it is obvious to see that we > currently have these issues: > > t/Sopma.t and t/BioGraphics.t still have fails that I mentioned in > another thread. > > t/protgraph.t, t/blast_pull.t, t/SearchIO.t, t/RestrictionIO.t, > t/RNA_SearchIO.t, t/PopGen.t, t/Genewise.t, t/Assembly.t and > t/Annotation.t all have TODO tests. If you know about those modules, now > would be a great time to implement those TODOs! > > Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are > deprecated' warnings. Ah, that reminds me! I recently tried to do an install of the cvs head (a week or two ago) on a clean installation of Debian 4.0 (etch). During the installation, of dependencies, Bio::ASN1::EntrezGene threw an error as it depends on Bioperl. I seem to remember this circular dependency cropping up before - am I correct - and can you remind me how this was "fixed"? 
Cheers Nath From bix at sendu.me.uk Wed Jun 27 18:52:01 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 27 Jun 2007 19:52:01 +0100 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <4682B000.2050707@sheffield.ac.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> Message-ID: <4682B1D1.3080206@sendu.me.uk> Nathan S. Haigh wrote: > I recently tried to do an install of the cvs head (a week or two ago) on > a clean installation of Debian 4.0 (etch). During the installation, of > dependencies, Bio::ASN1::EntrezGene threw an error as it depends on > Bioperl. I seem to remember this circular dependency cropping up before > - am I correct - and can you remind me how this was "fixed"? Yes, it always happens. It was 'fixed' by being completely ignored by me. Installation is guaranteed to fail, but if you really want it, trying to install again after you already have Bioperl installed will result in success. Clearly something nicer could be done. Suggestions on a postcard... From cjfields at uiuc.edu Wed Jun 27 19:01:01 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 14:01:01 -0500 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <4682B000.2050707@sheffield.ac.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> Message-ID: On Jun 27, 2007, at 1:44 PM, Nathan S. Haigh wrote: > Sendu Bala wrote: >> ... >> Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are >> deprecated' warnings. > > Ah, that reminds me! > > I recently tried to do an install of the cvs head (a week or two > ago) on > a clean installation of Debian 4.0 (etch). During the installation, of > dependencies, Bio::ASN1::EntrezGene threw an error as it depends on > Bioperl. 
I seem to remember this circular dependency cropping up > before > - am I correct - and can you remind me how this was "fixed"? > > Cheers > Nath Wonder if Mingyi Liu would allow Bio::ASN1::EntrezGene to become part of Bioperl (and he could become a dev). That would solve it. chris From n.haigh at sheffield.ac.uk Wed Jun 27 19:16:40 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Wed, 27 Jun 2007 20:16:40 +0100 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> Message-ID: <4682B798.1010409@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Chris Fields wrote: > > On Jun 27, 2007, at 1:44 PM, Nathan S. Haigh wrote: > >> Sendu Bala wrote: >>> ... >>> Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are >>> deprecated' warnings. >> >> Ah, that reminds me! >> >> I recently tried to do an install of the cvs head (a week or two ago) on >> a clean installation of Debian 4.0 (etch). During the installation, of >> dependencies, Bio::ASN1::EntrezGene threw an error as it depends on >> Bioperl. I seem to remember this circular dependency cropping up before >> - am I correct - and can you remind me how this was "fixed"? >> >> Cheers >> Nath > > Wonder if Mingyi Liu would allow Bio::ASN1::EntrezGene to become part of > Bioperl (and he could become a dev). That would solve it. > > chris Just to put the feelers out to see what people think. It seems (to me at least) that Bioperl modules could/should? be released as individual modules and that "bioperl" would really constitute a "bundle" of all these modules - in terms of CPAN anyway. Am I correct in this thinking? The Bio::ASN1::EntrezGene could simply require a particular module rather than the whole of bioperl - might get out of the circular dependency theoretically!? 
I'm not suggesting moving in this direction, but just wondered what others thought about this concept? Nath -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGgreYczuW2jkwy2gRAi5IAJ9/Alq1fktEmAF16DlKcBVcy7d+jQCeIj+X tOFQUQ7cGJLUITEDw1+QLxc= =Yc+g -----END PGP SIGNATURE----- From cjfields at uiuc.edu Wed Jun 27 19:31:44 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 14:31:44 -0500 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <4682B798.1010409@sheffield.ac.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> Message-ID: <33C76559-4771-4FDC-9EEA-1645BC3C576C@uiuc.edu> On Jun 27, 2007, at 2:16 PM, Nathan S. Haigh wrote: > ... > > Just to put the feelers out to see what people think. > > It seems (to me at least) that Bioperl modules could/should? be > released > as individual modules and that "bioperl" would really constitute a > "bundle" of all these modules - in terms of CPAN anyway. Am I > correct in > this thinking? The Bio::ASN1::EntrezGene could simply require a > particular module rather than the whole of bioperl - might get out of > the circular dependency theoretically!? > > I'm not suggesting moving in this direction, but just wondered what > others thought about this concept? > > Nath Well, Steve suggested splitting some of core into distinct groups, which I tend to agree with in some respects (speed up releases for those modules, such as SearchIO, DB, Graphics). The problem we have yet to solve is what we consider 'core'. Is it Bio::Seq and related? Should it include Bio::DB*? Should it just be Bio::* modules with no or very few external dependencies? And so on..., probably not a decision we want to make immediately (until after svn migration, tests finished, maybe a release or two, a beer)... 
The Bioperl module dependency that Bio::ASN1::EntrezGene has is Bio::Index::AbstractSeq. You could try a test build of Bio::ASN1::EntrezGene to see what happens. chris From hlapp at gmx.net Wed Jun 27 19:49:15 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 27 Jun 2007 16:49:15 -0300 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> Message-ID: On Jun 27, 2007, at 1:27 PM, David Messina wrote: > I would think we would want "Author Date Id Rev URL" set on > everything, no?. So either cvs2svn or your tool (whichever you think > is better), followed by > > svn propset svn:keywords "Author Date Id Rev URL" * Shouldn't this be done recursively? -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Wed Jun 27 19:50:27 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 27 Jun 2007 16:50:27 -0300 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> Message-ID: On Jun 27, 2007, at 12:18 PM, Chris Fields wrote: > Most projects make a clean break with cvs (no more commits) for the > reasons you point out. Not sure how the other core devs feel about > that but I could go for that; it would def. prevent headaches. There shouldn't be any cvs write support after the cut-over I think. 
I don't see the benefit that would justify the huge headache potential. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Wed Jun 27 20:01:40 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 15:01:40 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> Message-ID: <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> On Jun 27, 2007, at 2:50 PM, Hilmar Lapp wrote: > > On Jun 27, 2007, at 12:18 PM, Chris Fields wrote: > >> Most projects make a clean break with cvs (no more commits) for the >> reasons you point out. Not sure how the other core devs feel about >> that but I could go for that; it would def. prevent headaches. > > There shouldn't be any cvs write support after the cut-over I > think. I don't see the benefit that would justify the huge headache > potential. > > -hilmar Agreed, so maybe we should set that in stone. That means no svn->cvs syncing post-migration as well, I assume. Now how about a quick straw poll, what kind of access? svn+ssh is already available, but some (Aaron among them) have indicated they would like https as well (not sure how involved it would be to set up). 
chris From hlapp at gmx.net Wed Jun 27 20:08:40 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 27 Jun 2007 17:08:40 -0300 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> Message-ID: <4FDAE951-7CFF-40B1-9CE3-9BCEAD37E58F@gmx.net> On Jun 27, 2007, at 5:01 PM, Chris Fields wrote: > That means no svn->cvs syncing post-migration as well, I assume. That's a bit of a different story. People out there have URL links into our anonymous CVS repository. If it's not too troublesome (and I tend to think it's not) I'd like to maintain those in working order, either b/c there is still a sync'ed r/o cvs, or b/c of a cgi script that maps between the URL flavors (i.e., that maps a CVS-style URL to the equivalent SVN link). -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hartzell at alerce.com Wed Jun 27 20:15:10 2007 From: hartzell at alerce.com (George Hartzell) Date: Wed, 27 Jun 2007 16:15:10 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> Message-ID: <18050.50510.84363.355034@almost.alerce.com> David Messina writes: > > [Chris] > > > > I managed to get it working using file://. 
Haven't tried svn+ssh yet > > but I've had persistent problems getting ssh to work properly on my > > macbook; not sure why yet but I haven't had time to play around > > with it. > > I just did a checkout and a test commit, both via svn+ssh -- works > great for me. Is there anyone working outside of bioperl-{run,live,ext}? g. From bix at sendu.me.uk Wed Jun 27 20:22:13 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 27 Jun 2007 21:22:13 +0100 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <4682B798.1010409@sheffield.ac.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> Message-ID: <4682C6F5.4020406@sendu.me.uk> Nathan S. Haigh wrote: > It seems (to me at least) that Bioperl modules could/should? be released > as individual modules and that "bioperl" would really constitute a > "bundle" of all these modules - in terms of CPAN anyway. Am I correct in > this thinking? The Bio::ASN1::EntrezGene could simply require a > particular module rather than the whole of bioperl - might get out of > the circular dependency theoretically!? No, it wouldn't. The 'problem' only arises because the user is /choosing/ to install both Bioperl and Bio::ASN1::EntrezGene at the same time. So even if Bioperl was released as separate modules there would still be that 'bundle' and users would still choose to do the same thing: install all the Bioperl modules as well as all its /optional/ recommended modules. And there lies the problem: Bio::ASN1::EntrezGene requires Bioperl modules, and one Bioperl module requires Bio::ASN1::EntrezGene, so the circularity isn't solved. (FYI: Bio::ASN1::EntrezGene requires Bio::Index::AbstractSeq Bio::Index::AbstractSeq requires a couple of Bioperl modules, including Bio::Root::Root Bio::SeqIO::entrezgene requires Bio::ASN1::EntrezGene and a bunch of Bioperl modules, including Bio::Root::Root. 
) You only avoid circularity by choosing not to install everything in one go. Which is something you can do right now with no problems. From n.haigh at sheffield.ac.uk Wed Jun 27 20:24:18 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Wed, 27 Jun 2007 21:24:18 +0100 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> Message-ID: <4682C772.5070502@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hilmar Lapp wrote: > On Jun 27, 2007, at 12:18 PM, Chris Fields wrote: > >> Most projects make a clean break with cvs (no more commits) for the >> reasons you point out. Not sure how the other core devs feel about >> that but I could go for that; it would def. prevent headaches. > > There shouldn't be any cvs write support after the cut-over I think. > I don't see the benefit that would justify the huge headache potential. > > -hilmar I agree. A clean switch from cvs read/write to svn read/write plus cvs read only sounds the least problematic! However, how will links to cvs be dealt with? Links on Bioperl could be switched over to point to svn, but what about possible links from external sources? Maybe a more generic approach of redirection could work? Or a simple warning page stating the fact that we have moved from cvs to svn and provide a common link to follow? 
Nath -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGgsdyczuW2jkwy2gRAtuyAKDIpN0TNX0U7sTuE3i+fj6WFZ1K0QCfcX7Y 81KurFwJlRtYFxSmLZP56Sk= =pp7b -----END PGP SIGNATURE----- From hlapp at gmx.net Wed Jun 27 20:30:19 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 27 Jun 2007 17:30:19 -0300 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <18049.30026.61328.134490@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> Message-ID: <834150E7-9307-4B02-B5F0-6FB01393435D@gmx.net> On Jun 26, 2007, at 5:21 PM, George Hartzell wrote: > > svn+ssh://dev.open-bio.org/home/hartzell/bioperl > Cool - this works for me. One thing I notice is that in cvs log you see which version is in which branch which is useful to answer user queries that might be a version problem. svn log doesn't seem to want to show that. Does anyone have ideas for how to do this in svn? -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Wed Jun 27 20:32:18 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 27 Jun 2007 17:32:18 -0300 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <4682C772.5070502@sheffield.ac.uk> References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <4682C772.5070502@sheffield.ac.uk> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Jun 27, 2007, at 5:24 PM, Nathan S. Haigh wrote: > However, how will links to cvs be dealt with? 
Well I said before that probably one can write a couple of lines of Perl to write a cgi script that returns the appropriate redirect URL with a redirect status code. -hilmar - -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (Darwin) iD8DBQFGgslWuV6N2JxL7qsRAvsTAKDjR18NzWzlj74mCF+diNpe2dLV2ACgn/4Y f6sJ/ngeKEGpKHgyAHM1DAA= =8n0E -----END PGP SIGNATURE----- From cjfields at uiuc.edu Wed Jun 27 20:50:11 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 15:50:11 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <834150E7-9307-4B02-B5F0-6FB01393435D@gmx.net> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <834150E7-9307-4B02-B5F0-6FB01393435D@gmx.net> Message-ID: <9DD44BC0-95D5-43E1-83F3-E3ACC6825563@uiuc.edu> On Jun 27, 2007, at 3:30 PM, Hilmar Lapp wrote: > > On Jun 26, 2007, at 5:21 PM, George Hartzell wrote: > >> >> svn+ssh://dev.open-bio.org/home/hartzell/bioperl >> > > Cool - this works for me. > > One thing I notice is that in cvs log you see which version is in > which branch which is useful to answer user queries that might be a > version problem. svn log doesn't seem to want to show that. Does > anyone have ideas for how to do this in svn? > > -hilmar We prob. should move it to a new directory ASAP which george can write to when he needs to update. cvs is in /home/repository/ bioperl, so maybe something similar, like /home/svn/repository/bioperl? 
chris From cjfields at uiuc.edu Wed Jun 27 20:51:37 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 15:51:37 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <4FDAE951-7CFF-40B1-9CE3-9BCEAD37E58F@gmx.net> References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> <4FDAE951-7CFF-40B1-9CE3-9BCEAD37E58F@gmx.net> Message-ID: <4D8CAAD9-4774-47FB-84E0-7FBA50EC377B@uiuc.edu> On Jun 27, 2007, at 3:08 PM, Hilmar Lapp wrote: > > On Jun 27, 2007, at 5:01 PM, Chris Fields wrote: > >> That means no svn->cvs syncing post-migration as well, I assume. > > That's a bit of a different story. People out there have URL links > into our anonymous CVS repository. If it's not too troublesome (and > I tend to think it's not) I'd like to maintain those in working > order, either b/c there is still a sync'ed r/o cvs, or b/c of a cgi > script that maps between the URL flavors (i.e., that maps a CVS- > style URL to the equivalent SVN link). > > -hilmar I'll try getting a wiki page up as a checklist for this, including what direction we're heading in, ideas (your list and CGI redirect ideas, svn:ignore issues, etc). Dave has already started on the 'getting bioperl using svn' wiki page. If we intend to sync cvs with svn we need to find the right tools or at least check for other projects which have done something similar. I haven't googled on that yet but I'll attempt to tonight. chris From cjfields at uiuc.edu Wed Jun 27 20:53:08 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 15:53:08 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: Message-ID: <4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu> bioperl-run also. I think the run CVS repo has some binary files, so if there are any problems with cvs2svn it'll be there. 
chris On Jun 27, 2007, at 3:19 PM, Brian Osborne wrote: > George, > > bioperl-db and bioperl-network should be included, I think. > > Brian O > > > On 6/27/07 4:15 PM, "George Hartzell" wrote: > >> David Messina writes: >>>> [Chris] >>>> >>>> I managed to get it working using file://. Haven't tried svn >>>> +ssh yet >>>> but I've had persistent problems getting ssh to work properly on my >>>> macbook; not sure why yet but I haven't had time to play around >>>> with it. >>> >>> I just did a checkout and a test commit, both via svn+ssh -- works >>> great for me. >> >> Is there anyone working outside of bioperl-{run,live,ext}? >> >> g. >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Wed Jun 27 21:05:50 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 27 Jun 2007 22:05:50 +0100 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <4682C6F5.4020406@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> Message-ID: <4682D12E.3000803@sendu.me.uk> Sendu Bala wrote: > Nathan S. Haigh wrote: >> It seems (to me at least) that Bioperl modules could/should? be released >> as individual modules and that "bioperl" would really constitute a >> "bundle" of all these modules - in terms of CPAN anyway. Am I correct in >> this thinking? The Bio::ASN1::EntrezGene could simply require a >> particular module rather than the whole of bioperl - might get out of >> the circular dependency theoretically!? > > No, it wouldn't. [snip] > You only avoid circularity by choosing not to install everything in one > go. Errr... I take that back. 
Since CPAN bundles install things in a certain order, you just have to make sure that everything Bio::ASN1::EntrezGene needs is installed first, then Bio::ASN1::EntrezGene, then Bio::SeqIO::entrezgene. But the main problem with this approach is that maintenance, global-style code improvements and releases become a nightmare. I could, perhaps, imagine a scenario where the repository stayed as-is (one monolithic collection), but the dist action of Build.PL could be altered to generate a release package per module instead of one big release package of all modules, as is currently the case. Is there much value in doing that? Does anyone want me to look into the feasibility of such a thing? From bosborne11 at verizon.net Wed Jun 27 20:19:47 2007 From: bosborne11 at verizon.net (Brian Osborne) Date: Wed, 27 Jun 2007 16:19:47 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <18050.50510.84363.355034@almost.alerce.com> Message-ID: George, bioperl-db and bioperl-network should be included, I think. Brian O On 6/27/07 4:15 PM, "George Hartzell" wrote: > David Messina writes: >>> [Chris] >>> >>> I managed to get it working using file://. Haven't tried svn+ssh yet >>> but I've had persistent problems getting ssh to work properly on my >>> macbook; not sure why yet but I haven't had time to play around >>> with it. >> >> I just did a checkout and a test commit, both via svn+ssh -- works >> great for me. > > Is there anyone working outside of bioperl-{run,live,ext}? > > g. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From n.haigh at sheffield.ac.uk Wed Jun 27 21:25:53 2007 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Wed, 27 Jun 2007 22:25:53 +0100 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <4682D12E.3000803@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> Message-ID: <4682D5E1.2030507@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Sendu Bala wrote: > Sendu Bala wrote: >> Nathan S. Haigh wrote: >>> It seems (to me at least) that Bioperl modules could/should? be released >>> as individual modules and that "bioperl" would really constitute a >>> "bundle" of all these modules - in terms of CPAN anyway. Am I correct in >>> this thinking? The Bio::ASN1::EntrezGene could simply require a >>> particular module rather than the whole of bioperl - might get out of >>> the circular dependency theoretically!? >> >> No, it wouldn't. > [snip] >> You only avoid circularity by choosing not to install everything in >> one go. > > Errr... I take that back. Since CPAN bundles install things in a certain > order, you just have to make sure that everything Bio::ASN1::EntrezGene > needs is installed first, then Bio::ASN1::EntrezGene, then > Bio::SeqIO::entrezgene. > > But the main problem with this approach is that maintenance, > global-style code improvements and releases become a nightmare. I could, > perhaps, imagine a scenario where the repository stayed as-is (one > monolithic collection), but the dist action of Build.PL could be altered > to generate a release package per module instead of one big release > package of all modules, as is currently the case. > > Is there much value in doing that? Does anyone want me to look into the > feasibility of such a thing? 
I think the value would be in other external modules being able to use bioperl modules with more ease (not sure how many modules have, or currently depend on bioperl) as they would depend on a single module, rather than the whole package. However, how would the dependencies of each module be handled? I'm clearly thinking aloud, but... maybe this would tease apart "cliques" of modules that are interdependent? and could in themselves be shipped as bundles e.g. Bio::Graphics and have a "master" bioperl bundle that installs all the bioperl modules. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGgtXhczuW2jkwy2gRAiftAKDZQGDpaq5saEyE3ZfPyFqli4j+8QCfXbIB 2EZjccEFEzfFlx4H47gzwLk= =nobl -----END PGP SIGNATURE----- From hlapp at gmx.net Wed Jun 27 21:35:28 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 27 Jun 2007 18:35:28 -0300 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu> References: <4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu> Message-ID: <9FEF16FD-47F6-4D7B-A1E6-1A92CC82570C@gmx.net> Is there a reason not to port every subproject over? -hilmar On Jun 27, 2007, at 5:53 PM, Chris Fields wrote: > bioperl-run also. I think the run CVS repo has some binary files, so > if there are any problems with cvs2svn it'll be there. > > chris > > On Jun 27, 2007, at 3:19 PM, Brian Osborne wrote: > >> George, >> >> bioperl-db and bioperl-network should be included, I think. >> >> Brian O >> >> >> On 6/27/07 4:15 PM, "George Hartzell" wrote: >> >>> David Messina writes: >>>>> [Chris] >>>>> >>>>> I managed to get it working using file://. Haven't tried svn >>>>> +ssh yet >>>>> but I've had persistent problems getting ssh to work properly >>>>> on my >>>>> macbook; not sure why yet but I haven't had time to play around >>>>> with it. 
>>>> >>>> I just did a checkout and a test commit, both via svn+ssh -- works >>>> great for me. >>> >>> Is there anyone working outside of bioperl-{run,live,ext}? >>> >>> g. >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Wed Jun 27 21:36:29 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 16:36:29 -0500 Subject: [Bioperl-l] Splits again, formerly Test overhaul complete In-Reply-To: <4682D12E.3000803@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> Message-ID: <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> On Jun 27, 2007, at 4:05 PM, Sendu Bala wrote: > Sendu Bala wrote: >> Nathan S. Haigh wrote: >>> It seems (to me at least) that Bioperl modules could/should? be >>> released >>> as individual modules and that "bioperl" would really constitute a >>> "bundle" of all these modules - in terms of CPAN anyway. Am I >>> correct in >>> this thinking? The Bio::ASN1::EntrezGene could simply require a >>> particular module rather than the whole of bioperl - might get >>> out of >>> the circular dependency theoretically!? >> No, it wouldn't. 
> [snip] >> You only avoid circularity by choosing not to install everything >> in one go. > > Errr... I take that back. Since CPAN bundles install things in a > certain order, you just have to make sure that everything > Bio::ASN1::EntrezGene needs is installed first, then > Bio::ASN1::EntrezGene, then Bio::SeqIO::entrezgene. > > But the main problem with this approach is that maintenance, global- > style code improvements and releases become a nightmare. I could, > perhaps, imagine a scenario where the repository stayed as-is (one > monolithic collection), but the dist action of Build.PL could be > altered to generate a release package per module instead of one big > release package of all modules, as is currently the case. > > Is there much value in doing that? Does anyone want me to look into > the feasibility of such a thing? Not for the time being, at least in my opinion. Too much on our plate at this point with svn migration, test conversion, bugzilla running over (next point of attack!), etc. Maybe something to think about after, though I like the idea of a few splits to core as Steve suggested (SearchIO, Graphics, some LWP-related DB modules). My (albeit extreme) thought is to have a lean-and-mean set of 'core' modules with as few external dependencies as possible, which could work around the circular dependency issue in this case:

                     dep.on                      dep.on
  Bio::Auxiliary     ----->   ASN1::EntrezGene   ----->   core
  (with EntrezGene)                                       (basic SeqIO, Index, DB, etc)
        \---->------>--- dep.on ->----->----->----/

Bioperl auxiliary modules would list core as a required dependency along with anything else needed for that particular aux. section (i.e. XML parsers, LWP, GD, etc.). The whole mess, if needed, would be installed using Bundle::BioPerl or similar, with no part released w/o testing on the whole 'base' to ensure proper interaction. If a fix needed to be made in one set, make the fix, test against bioperl 'base' as a whole, and release when possible. 
No need to wait for a full-fledged 1.5.3 release. Maybe wishful thinking... chris From cjfields at uiuc.edu Wed Jun 27 21:44:47 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 16:44:47 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <9FEF16FD-47F6-4D7B-A1E6-1A92CC82570C@gmx.net> References: <4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu> <9FEF16FD-47F6-4D7B-A1E6-1A92CC82570C@gmx.net> Message-ID: <1CB82DD7-A997-47E8-A6A5-2AC9B375E875@uiuc.edu> We should port them all, yes. chris On Jun 27, 2007, at 4:35 PM, Hilmar Lapp wrote: > Is there a reason not to port every subproject over? > > -hilmar From cjfields at uiuc.edu Wed Jun 27 21:53:02 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 16:53:02 -0500 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <4682D5E1.2030507@sheffield.ac.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <4682D5E1.2030507@sheffield.ac.uk> Message-ID: <1DF8175A-5212-40CE-A2C5-B9C34E057D00@uiuc.edu> On Jun 27, 2007, at 4:25 PM, Nathan S. Haigh wrote: >> ... >> Is there much value in doing that? Does anyone want me to look >> into the >> feasibility of such a thing? > > > I think the value would be in other external modules being able to use > bioperl modules with more ease (not sure how many modules have, or > currently depend on bioperl) as they would depend on a single module, > rather than the whole package. However, how would the dependencies of > each module be handled? I'm clearly thinking aloud, but....Maybe this > would tease apart "cliques" of modules that are interdependent? and > could in themselves be shipped as bundles e.g. Bio::Graphics and > have a > "master" bioperl bundle that installa all the bioperl modules. 
See my response to Sendu, and Steve Chervitz's original post and related thread: http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15315/focus=15315 which pretty much covers the same ground. I think at most 4-5 split 'cliques', including core, with the fewest possible dependencies in core. If we do any of this, it prob. should wait until after an svn migration and bugzilla bug stomping unless there is a (well-argued) advantage to doing it now. chris From n.haigh at sheffield.ac.uk Wed Jun 27 22:07:31 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Wed, 27 Jun 2007 23:07:31 +0100 Subject: [Bioperl-l] Test overhaul complete In-Reply-To: <1DF8175A-5212-40CE-A2C5-B9C34E057D00@uiuc.edu> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <4682D5E1.2030507@sheffield.ac.uk> <1DF8175A-5212-40CE-A2C5-B9C34E057D00@uiuc.edu> Message-ID: <4682DFA3.9090100@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Chris Fields wrote: > > On Jun 27, 2007, at 4:25 PM, Nathan S. Haigh wrote: > >>> ... >>> Is there much value in doing that? Does anyone want me to look into the >>> feasibility of such a thing? >> >> >> I think the value would be in other external modules being able to use >> bioperl modules with more ease (not sure how many modules have, or >> currently depend on bioperl) as they would depend on a single module, >> rather than the whole package. However, how would the dependencies of >> each module be handled? I'm clearly thinking aloud, but....Maybe this >> would tease apart "cliques" of modules that are interdependent? and >> could in themselves be shipped as bundles e.g. Bio::Graphics and have a >> "master" bioperl bundle that installa all the bioperl modules.
> > See my response to Sendu, and Steve Chervitz's original post and related > thread: > > http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15315/focus=15315 > > which pretty much covers the same ground. I think at most 4-5 split > 'cliques', including core, with the fewest possible dependencies in > core. If we do any of this, it prob. should wait until after an svn > migration and bugzilla bug stomping unless there is a (well-argued) > advantage to doing it now. > > chris That's fine by me - or should I say, the best way forward - I was really just thinking aloud :) Nath -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGgt+jczuW2jkwy2gRAhPmAKDCgI1BOp/MOQVUQhQGqWaRRfPTaACfTPix TSi/e8PtYTwpxn6x+ewrjBs= =7Vp1 -----END PGP SIGNATURE----- From bix at sendu.me.uk Wed Jun 27 22:43:48 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 27 Jun 2007 23:43:48 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> Message-ID: <4682E824.1050507@sendu.me.uk> Chris Fields wrote: > On Jun 27, 2007, at 4:05 PM, Sendu Bala wrote: >> But the main problem with this approach is that maintenance, global- >> style code improvements and releases become a nightmare. I could, >> perhaps, imagine a scenario where the repository stayed as-is (one >> monolithic collection), but the dist action of Build.PL could be >> altered to generate a release package per module instead of one big >> release package of all modules, as is currently the case. >> >> Is there much value in doing that? Does anyone want me to look into >> the feasibility of such a thing? 
> > Not for the time being, at least in my opinion. Too much on our > plate at this point with svn migration, test conversion, bugzilla > running over (next point of attack!), etc. Maybe something to think > about after, though I like the idea of a few splits to core as Steve > suggested (SearchIO, Graphics, some LWP-related DB modules). [snip] > If a fix needed to be made in one set, make the fix, test against > bioperl 'base' as a whole, and release when possible. No need to > wait for a full-fledged 1.5.3 release. What advantage is there of these defined splits instead of individual modules? As I see it you lose some of the potential benefits of breaking Bioperl up completely, whilst also suffering the maintenance problems I outlined in my objection to Steve's post. Being able to work on all Bioperl from a single cvs (ne svn) check out/archive, whilst distributing it as individual modules on CPAN seems like the best of both worlds to me. What am I missing? From hartzell at alerce.com Thu Jun 28 00:41:01 2007 From: hartzell at alerce.com (George Hartzell) Date: Wed, 27 Jun 2007 20:41:01 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <9DD44BC0-95D5-43E1-83F3-E3ACC6825563@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <834150E7-9307-4B02-B5F0-6FB01393435D@gmx.net> <9DD44BC0-95D5-43E1-83F3-E3ACC6825563@uiuc.edu> Message-ID: <18051.925.23313.932916@almost.alerce.com> Chris Fields writes: > [...] > We prob. should move it to a new directory ASAP which george can > write to when he needs to update. cvs is in > /home/repository/bioperl, so maybe something similar, like /home/svn/repository/bioperl? I'd be parsimonious (lazy...) and go for /home/svn/bioperl. g.
From hartzell at alerce.com Thu Jun 28 00:46:29 2007 From: hartzell at alerce.com (George Hartzell) Date: Wed, 27 Jun 2007 20:46:29 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> Message-ID: <18051.1253.87485.235496@almost.alerce.com> Chris Fields writes: > [...] > Now how about a quick straw poll, what kind of access? svn+ssh is > already available, but some (Aaron among them) have indicated they > would like https as well (not sure how involved it would be to set up). What we do here, in large part, depends on what our host machine makes available to us. Is there an apache instance that we can use? Maybe a separate one? May someone among us configure it, or do we need to ask for help? (in other words, does anyone have sudo?) Is there some reason to not include http: (using Digest authentication so that passwords aren't passed in the clear?)? Maybe even go so far as to ask why bother with https:, it's not like we need to transfer any data encrypted.... g. From dmessina at wustl.edu Thu Jun 28 03:02:25 2007 From: dmessina at wustl.edu (David Messina) Date: Wed, 27 Jun 2007 22:02:25 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> Message-ID: On Jun 27, 2007, at 2:49 PM, Hilmar Lapp wrote: > > On Jun 27, 2007, at 1:27 PM, David Messina wrote: > >> I would think we would want "Author Date Id Rev URL" set on >> everything, no?. 
So either cvs2svn or your tool (whichever you think >> is better), followed by >> >> svn propset svn:keywords "Author Date Id Rev URL" * > > Shouldn't this be done recursively? Yep, good catch! Thanks, Hilmar. Should be: svn propset --recursive svn:keywords "Author Date Id Rev URL" * From jason at bioperl.org Thu Jun 28 03:29:09 2007 From: jason at bioperl.org (Jason Stajich) Date: Thu, 28 Jun 2007 00:29:09 -0300 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <18051.1253.87485.235496@almost.alerce.com> References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> <18051.1253.87485.235496@almost.alerce.com> Message-ID: I think Chris D and I will need to confer a bit on https+svn. I don't know when we'll have a good chance to discuss everything. At some point this discussion may need to be taken off the bioperl list and kept to just the interested parties, as we're delving into hardware geek land. The repository machine (dev) is a locked down machine, meaning it only really runs ssh and not many other servers, and httpd is not among them. We have anonymous CVS (client and through httpd browsing) running on a separate machine (code) that has the info rsynced over every 10 or 15 minutes. The foundation websites and mailing lists run on a third machine (portal). If we decide to support https we'll need to spend a little time deciding how well we can keep it locked down - it will only be https, not http, for example, and we may want to see about limiting ssh access to everyone if we migrate all OBF projects over to SVN and only support https. Again to re-iterate what I think we would do: - SVN read/write will live on 'dev'; _WHEN_ we switch over, no more writes to the CVS repository.
It will be available by ssh+svn and potentially by https+svn - SVN read-only will live on 'code', it will be accessible by http+svn - CVS read-only will live on 'code', this will only be a sync from the SVN to the CVS. See http://svn2cvs.tigris.org/ for details As I tried to ask for in the past, would someone also illustrate the importance of why _WE_ need to switch to SVN on a wiki page on Bioperl so that when someone complains/asks about this in the future the arguments are already laid out. I am basically fine with it, but I don't honestly see a compelling reason beyond what has been mentioned wrt better integration in IDEs. http://bioperl.org/wiki/Why_SVN -jason On Jun 27, 2007, at 9:46 PM, George Hartzell wrote: > Chris Fields writes: >> [...] >> Now how about a quick straw poll, what kind of access? svn+ssh is >> already available, but some (Aaron among them) have indicated they >> would like https as well (not sure how involved it would be to set >> up). > > What we do here, in large part, depends on what our host machine makes > available to us. > > Is there an apache instance that we can use? Maybe a separate one? > > May someone among us configure it, or do we need to ask for help? (in > other words, does anyone have sudo?) > > Is there some reason to not include http: (using Digest authentication > so that passwords aren't passed in the clear?)? Maybe even go so far > as to ask why bother with https:, it's not like we need to transfer > any data encrypted.... > > g. 
-- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From jason at bioperl.org Thu Jun 28 03:51:32 2007 From: jason at bioperl.org (Jason Stajich) Date: Thu, 28 Jun 2007 00:51:32 -0300 Subject: [Bioperl-l] Splits again In-Reply-To: <4682E824.1050507@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> Message-ID: <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org> Hey guys - I'm wading in a bit late as I haven't had time to keep up with the whole discussion. So you are suggesting 800+ individual CPAN modules? I don't think that is a good idea. Why would you split up Bio::Seq::RichSeq and Bio::Seq into two separate packages for example? I think if you really want to move away from the monolithic install it has to be more logical by function - but I am not that optimistic that this is going to actually be easier for people. Maybe I'm misunderstanding. What are the arguments for separating things -- to make it so people aren't scared by the number of modules so they'll code? It seems like some people just want it to be installed and run scripts - does having them install dozens of modules work? Do we need to consider how much this would suck for people who can't use CPAN or Module::Build to automate dependency tracking and installation? How does it work when modules are deprecated? I'm not sure I have made up my mind on what I'd like to see, but at some point I think we need to get a clearer idea of what audience we are trying to serve best.
If we want it to be easy to install, maybe we should invest time into making OSX double-click installers, RPMs, and the Windows stuff easily installable. Or do we want to serve the developers who aren't using SVN, so that we want to push out releases of modules ASAP? I just am not clear on the motivation for some of the proposed changes. Also - the main point I wanted to make - Can I suggest we spend a little time discussing what it will take to get a stable release for the current code as it stands (bioperl-live and bioperl-run)? It seems like we really need to do this first so that we have a stable release that can be followed by CVS -> SVN migration, then consider major changes to the repository structure and release packaging, and potential deprecation and incorporation of other modules. I assume there is no chance that we'd have a 1.6 candidate by BOSC next month? Will it be productive to schedule a fair amount of time at BOSC discussing how to partition out the packages into separate sub-packages after we've done a successful release rather than trying to change things right now? I realize not everyone will be there but maybe it will be easier to interact on this then. I think it will also be time to talk with Lincoln/Scott about how Gbrowse is structured and if that is working for them. There is so much code in different places that I think we need to figure out how to structure it properly so those packages can be released. It would probably mean moving Bio::Graphics, Bio::DB::GFF and Bio::DB::SeqFeature and gff tools for Gbrowse into separate packages so they could be released more regularly on par with Gbrowse schedules. Also I think someone needs to figure out Bio::Tools::GFF vs Bio::FeatureIO -- what do we want to do? I don't think we really fully support GFF3 that well -- the X2GFF scripts probably need some more good testing (where X is BLAST, FASTA, Sim4, GenBank, EMBL, etc.) and/or migration to the proper GFF writing.
-jason On Jun 27, 2007, at 7:43 PM, Sendu Bala wrote: > Chris Fields wrote: >> On Jun 27, 2007, at 4:05 PM, Sendu Bala wrote: >>> But the main problem with this approach is that maintenance, global- >>> style code improvements and releases become a nightmare. I could, >>> perhaps, imagine a scenario where the repository stayed as-is (one >>> monolithic collection), but the dist action of Build.PL could be >>> altered to generate a release package per module instead of one big >>> release package of all modules, as is currently the case. >>> >>> Is there much value in doing that? Does anyone want me to look into >>> the feasibility of such a thing? >> >> Not for the time being, at least in my opinion. Too much on our >> plate at this point with svn migration, test conversion, bugzilla >> running over (next point of attack!), etc. Maybe something to think >> about after, though I like the idea of a few splits to core as Steve >> suggested (SearchIO, Graphics, some LWP-related DB modules). > [snip] >> If a fix needed to be made in one set, make the fix, test against >> bioperl 'base' as a whole, and release when possible. No need to >> wait for a full-fledged 1.5.3 release. > > What advantage is there of these defined splits instead of individual > modules? As I see it you lose some of the potential benefits of > breaking > Bioperl up completely, whilst also suffering the maintenance > problems I > outlined in my objection to Steve's post. > > Being able to work on all Bioperl from a single cvs (ne svn) check > out/ > archive, whilst distributing it as individual modules on CPAN seems > like > the best of both worlds to me. What am I missing? 
-- Jason Stajich jason at bioperl.org http://jason.open-bio.org/ From chris at bioteam.net Thu Jun 28 04:08:25 2007 From: chris at bioteam.net (Chris Dagdigian) Date: Thu, 28 Jun 2007 00:08:25 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> <18051.1253.87485.235496@almost.alerce.com> Message-ID: <97A3257B-8E00-48D7-8B7D-51AD728CB8F7@bioteam.net> My understanding of "https+svn" is that it is actually WebDAV-over-HTTP, which means that not only would we need to light up a HTTPD server on the developer box, we'd also have to get a stable mod_dav module installed (sometimes not trivial) and then we would have to figure out how to handle the authentication bits. Right now with SSH we use Unix group permissions to figure out who can write to what repository -- WebDAV makes this a lot more complicated. Forcing encryption over https will prevent someone from sniffing a developer password, which removes the main security issue. The next problem is going to be integrating the DAV module with Linux PAM so that existing usernames and passwords can be used, -OR- we have to set up and maintain an entirely separate set of username and password maps for each developer and each SVN project. I'm not super concerned about this -- BioTeam runs svn internally and we expose our SVN for employees both via WebDAV and SVN+SSH - it's not that hard to set up. My biggest concern really has to do with how much extra work this will mean for the OBF sysadmin team. If there is an easy way to get a stable Apache/DAV/SVN integration going with authentication coming from Linux PAM then this is no big deal.
If we have to manually maintain separate authentication lists then it will be kind of a hassle. Like Jason mentioned, the OBF currently segregates "stuff" onto three different servers with three levels of security: - dev.open-bio.org -- Developers only, SSH access only (main sourcecode repository for OBF) - portal.open-bio.org -- Websites, Wikis, Blogs, Mailing list servers and helpdesk.open-bio.org - code.open-bio.org -- "Disposable" anonymous access server that we can easily burn/wipe/reinstall if it ever gets hacked Everything else that Jason mentioned is fine and easy to set up (if not already running): - SVN+SSH for developers - Anonymous SVN and Anonymous RSYNC for community access on code.open-bio.org - svn2cvs for whomever wants it on code.open-bio.org - web based SVN code browser installed on http://code.open-bio.org Regards, Chris On Jun 27, 2007, at 11:29 PM, Jason Stajich wrote: > I think Chris D and I will need to confer a bit on https+svn. I > don't know when we'll have a good chance to discuss everything. At > some point this discussion is may need to be taken off bioperl and > just the interested parties as we're delving into hardware geek land. > > The repository machine (dev) is a locked down machine meaning it > only really runs ssh and not many servers include httpd. We have > anonymous CVS (client and through httpd browsing) running on a > separate machine (code) that has the info rsynced over every 10 or > 15 minutes. The foundation websites and mailing lists run on a > third machine (portal). > > > If we decide to support https we'll need to spend a little time > deciding how well we can keep it locked down - it will only be > https not http for example and we may want to see about limiting > ssh access to everyone if we migrate all OBF projects over to SVN > and only support https. > > Again to re-iterate what I think we would do: > - SVN read/write will live on 'dev', _WHEN_ we switch over no > writes to the CVS repository. 
It will be available by ssh+svn and > potentially by https+svn > - SVN read-only will live on 'code', it will be accessible by http > +svn > - CVS read-only will live on 'code', this will only be a sync from > the SVN to the CVS. See http://svn2cvs.tigris.org/ for details > > > As I tried to ask for in the past, would someone also illustrate > the importance of why _WE_ need to switch to SVN on a wiki page on > Bioperl so that when someone complains/asks about this in the > future the arguments are already laid out. I am basically fine > with it, but I don't honestly see a compelling reason beyond what > has been mentioned wrt better integration in IDEs. > http://bioperl.org/wiki/Why_SVN > > -jason > On Jun 27, 2007, at 9:46 PM, George Hartzell wrote: > >> Chris Fields writes: >>> [...] >>> Now how about a quick straw poll, what kind of access? svn+ssh is >>> already available, but some (Aaron among them) have indicated they >>> would like https as well (not sure how involved it would be to >>> set up). >> >> What we do here, in large part, depends on what our host machine >> makes >> available to us. >> >> Is there an apache instance that we can use? Maybe a separate one? >> >> May someone among us configure it, or do we need to ask for help? >> (in >> other words, does anyone have sudo?) >> >> Is there some reason to not include http: (using Digest >> authentication >> so that passwords aren't passed in the clear?)? Maybe even go so far >> as to ask why bother with https:, it's not like we need to transfer >> any data encrypted.... >> >> g. 
>> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > Jason Stajich > jason at bioperl.org > http://jason.open-bio.org/ > > From cjfields at uiuc.edu Thu Jun 28 04:18:03 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 27 Jun 2007 23:18:03 -0500 Subject: [Bioperl-l] Splits again In-Reply-To: <4682E824.1050507@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> Message-ID: On Jun 27, 2007, at 5:43 PM, Sendu Bala wrote: > Chris Fields wrote: > ... >> If a fix needed to be made in one set, make the fix, test against >> bioperl 'base' as a whole, and release when possible. No need to >> wait for a full-fledged 1.5.3 release. > > What advantage is there of these defined splits instead of > individual modules? As I see it you lose some of the potential > benefits of breaking Bioperl up completely, whilst also suffering > the maintenance problems I outlined in my objection to Steve's post. > > Being able to work on all Bioperl from a single cvs (ne svn) check > out/ archive, whilst distributing it as individual modules on CPAN > seems like the best of both worlds to me. What am I missing? Okay, forewarned, but here's my long-winded reasoning. The short and sweet version: I (very) respectfully don't agree with you, at least re: the idea we should commit all modules to CPAN independently. It doesn't make any sense to me, but maybe you can elaborate more? Maybe I'm misinterpreting what you mean? Also, I agree with Steve C. that core is anything but a representation of a 'core' set of modules, and some sections could (should?) be split off into discrete, cohesive units. 
We may be alone in that camp, though it doesn't seem so (it's popped up more than a few times, in one form or another). If you want an in-depth explanation for both opinions, read on (below my sig), or feel free to bypass it. I'll understand. Finally, all of this should wait until later. Much later, like after a decent release, after svn, etc kind of 'later'. I think we can agree on that. . . . . . Still here? Okay... each issue (skip as needed): Individual CPAN modules: CPAN is not our personal versioning system; it may be if a distribution consists of only a few modules, but not when it's one of the largest distros present. If someone wants to update an individual bioperl module for a quick bug fix they are more than welcome to download it via cvs, svn, or even using a web browser, and replace the one they have. In most cases, it works w/o problems. With Module::Build you have even made it easier if a full installation is necessary. I'm trying to reason how one could break up the individual SeqIO/SearchIO/otherIO modules into single module distributions. They are intrinsically tied together (SeqIO::genbank won't work w/o SeqIO, which relies on the various interfaces, RootIO, and on down). How would tests be run off CPAN when the modules are distributed independently? Would they also be individually distributed? What would you use to tie all the individual modules together? How would you explain to the CPAN maintainers that you want to split bioperl into 990 individual modules, all updated independently, but intend on bundling them afterwards anyway? I'm failing to see the advantages to this approach, but if you can find an example where this was done successfully on CPAN or elsewhere maybe I could see what you mean. Splitting up core: Here are the advantages of a defined split as Steve and I see it (off the top of my head). Some of this probably reiterates my previous points, as well as Steve's, so apologies in advance.
- A lean, mean, focused set of bioperl base modules (core) w/o or with very few external deps, minimal installation issues, etc. The very basic stuff to get up and running.
- BioPerl bundled modules (Nathan's 'cliques') with defined, focused functionality, code, and tests, which add a bit more 'sugar' to the base functionality of the core. If you only care about parsing BLAST reports, get SearchIO, which requires core and optionally other modules (XML::SAX). If you want additional DB functionality apart from the very basic ones in core, install DB (with its additional requirements, including core, DBI, and so on). Same with Graphics, Tools, Tree/Phylo, etc. We just need to define and limit the number of splits.
- Easier to add additional bundled modules. For instance, I could focus all of my RNA work into a discrete set of modules (say, bioperl-rna) which I maintain, I ensure works with the latest core code, I ensure also plays well with the other children =) , and I distribute via CPAN. Same with EUtilities, which could go into a separated DB-related set or stay in core.
- If we want a full-fledged 'install everything', the CPAN Bundle system is available. I think it's easier to use a Bundle for 4-5, even 10 groups of modules as opposed to over 900.
- A Bundle or a build file where discrete distributions are listed (Bio::SearchIO, etc) wouldn't need to be updated every time a new module is added to a distribution. I suppose this could be automated, but why have the additional headache?
- A chance to cut out some cruft. We all know that particular areas need work or a complete overhaul (Restriction, Structure, maybe a few others). Smaller, concentrated sets of modules I believe would be easier to maintain, and those that don't get used will eventually fall out of favor and may be lost or replaced by the more maintained group of modules. Survival of the fittest.
- We already have had practice; bioperl-db, bioperl-run, bioperl-network, and others.
Those that have been routinely maintained and enjoy wide use (db, run, network) have survived; others not so much (corba-related stuff, microarray, ext, etc., though the code is still available if someone else wants to take it up and revive it!). Disadvantages of a defined split: - The initial headache of identifying which groups go where, coordinating with those who rely on bioperl (GMOD, etc) on how this will be set up, so on... - Separate groups of modules require testing together to ensure functionality is consistent and maintained (something I think you pointed out previously). - I think an increased possibility of branching is possible. - Extra headaches for devs, who have to keep track of the various critical distributions and make sure they work well together. - Maybe others, but it's getting late here. Add more as needed; I'm sure there are a number more. chris From cjfields at uiuc.edu Thu Jun 28 05:17:01 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 28 Jun 2007 00:17:01 -0500 Subject: [Bioperl-l] Splits again In-Reply-To: <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org> Message-ID: <671B8432-28DA-47DA-9E0C-66AF0E3D5973@uiuc.edu> D'oh! Just when I wanted to go to bed. It's not fair, you're in California... On Jun 27, 2007, at 10:51 PM, Jason Stajich wrote: > Hey guys - I'm wading in a bit late as I haven't had time to keep up > with whole discussion. > > So you are suggesting 800+ individual CPAN modules? I don't think > that is a good idea. Why would you split up Bio::Seq::RichSeq and > Bio::Seq into two separate packages for example? 
I think if you
> really want to move away from the monolithic install it has to be
> more logical by function - but I am not that optimistic that this is
> going to actually be easier for people. Maybe I'm misunderstanding.

Okay, so maybe it wasn't just me.

> What are the arguments for separating things -- to make it so people
> aren't scared by the number of modules so they'll code? It seems
> like some people just want it to be installed and run scripts - does
> having them install dozens of modules work. Do we need to consider
> people how much this would suck if someone can't use CPAN or
> Module::Build to automate dependency tracking installation? How
> does it work when modules are deprecated?

What I envision for core is maybe not just one distribution, but a cluster of distributions:

base - Bio::Seq; Bio::SeqIO; Bio::AlignIO, some Bio::DB, associated modules. Bare bones, with as few dependencies as possible.
aux - Any Bio::SeqIO, Bio::AlignIO, Bio::DB etc. that requires additional modules.
search - Bio::Search and SearchIO
tools - Bio::Tools, Bio::Restriction, maybe DB modules, GFF-related stuff?
graphics - Bio::Graphics. Maybe GMOD-related stuff here?

The last four would list bioperl-core as a dependency themselves along with any other modules necessary. We could also have the core Build.PL ask the user if they want to install the other non-base distros, and maybe include bioperl-db, bioperl-network, and bioperl-run in the loop if requested. All would be installed as a bundle similar to Bundle::BioPerl, but have regular CPAN point releases (1.x.x) independently from one another, i.e. for bug fixes, with a yearly/biyearly timed full release (1.x) of the whole shebang. Any point release for any 'core' distribution would have to be tested against the others prior to release.
This is basically following Steve's train of thought, though more elaborated: http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15315/ focus=15315 > I'm not sure I have made up my mind on what I'd like to see, but at > some point I think we need to get a clearer idea of what audience we > are trying to serve best. If want it to be easy to install maybe we > should invest time into making OSX double-click installers, RPMs, and > the Windows stuff easily installable. If we want to serve the > developers who aren't using SVN so we want to push out releases of > modules ASAP? I just am not clear on the motivation for some of the > proposed changes. I think regular CPAN releases with updated PPMs hosted via portal work fine for the most part, but it would be nice to host RPMs. Others (Allen Day, for instance) have donated time to generate RPMs but they seem to lag behind a bit more. The original idea for svn arose from an unrelated thread with Mark Johnson discussing something (Glimmer maybe?) and took off from there. I was actually pretty surprised it took on a life of it's own. As for the motivation to switch, I haven't specifically used it myself, but the large number of responses seem to indicate others have and seem happy with it. Rutger Vos had also indicated he would move Bio::Phylo over to the repo if we used svn. We def. should address the issues you bring up (why _WE_ need svn) more succinctly but that shouldn't be an issue. > Also - the main point I wanted to make - Can I suggest we spend a > little time discussing what it will take to get a stable release for > the current code as it stands (bioperl-live and bioperl-run)? It > seems like we really need to do this first so that we have a stable > release that can be followed by CVS -> SVN migration, then consider > major changes to the repository structure and release packaging, and > potential deprecation and incorporation of other modules. Agreed. We prob. 
need to schedule a good couple of days (or so) to squash bugs. > I assume there is no chance that we'd have a 1.6 candidate by BOSC > next month? Um, not likely as nothing has been addressed Feature/Annotation-wise (overloads are still there, methods have not been deprecated, etc). There was an underlying assumption these would have an effect on GMOD- related stuff (I remember reading a post from Scott Cain in the mail archive mentioning something along these lines after the 1.5 release hubbub). Maybe a quick 1.5.3 for BOSC, with a 1.6 for fall? > Will it be productive to schedule a fair amount of time at BOSC > discussing how to partition out the packages into separate sub- > packages after we've done a successful release rather than trying to > change things right now? I realize not everyone will be there but > maybe it will be easier to interact on this then. How many are going to be there? I can't go this year except on my own dime (which I don't have many of, student loans and all, sorry), though I'll likely be in a new lab by spring which is likely more amenable to funding. If there is a hackathon in the late fall (post- sept) I'll make it a point to go regardless. > I think it will also be time to talk with Lincoln/Scott about how > Gbrowse is structured and if that is working for them. There is too > much code in different places that I think we need to figure out how > to structure it properly so those packages can be released. It would > probably mean moving Bio::Graphics, Bio::DB::GFF and > Bio::DB::SeqFeature and gff tools for Gbrowse into separate packages > so they could be released more regularly on par with Gbrowse > schedules. Also I think someone needs to figure out Bio::Tools::GFF > vs Bio::FeatureIO -- what do we want to do? I don't think we really > fully support GFF3 that well -- the X2GFF scripts probably need some > more good testing (where X is BLAST,FASTA,Sim4, GenBank, EMBL, > etc... ) and or migration to the proper GFF writing. 
>
>
> -jason

Will Lincoln or Scott be at BOSC?

chris

From dmessina at wustl.edu Thu Jun 28 05:21:58 2007
From: dmessina at wustl.edu (David Messina)
Date: Thu, 28 Jun 2007 00:21:58 -0500
Subject: [Bioperl-l] finding statistics on AA
In-Reply-To: <4681F4B4.8010609@pacific.net.sg>
References: <4681F4B4.8010609@pacific.net.sg>
Message-ID:

Hi Melvin,

I don't think BioPerl has any information content-related code. I'm not terribly familiar with it myself, but the usual recommendation is to look at the EMBOSS package:

http://en.wikipedia.org/wiki/EMBOSS

Dave

From bix at sendu.me.uk Thu Jun 28 06:38:48 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 07:38:48 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org>
Message-ID: <46835778.5070901@sendu.me.uk>

Jason Stajich wrote:
> So you are suggesting 800+ individual CPAN modules?
> I don't think that is a good idea. Why would you split up
> Bio::Seq::RichSeq and Bio::Seq into two separate packages for
> example? I think if you really want to move away from the monolithic
> install it has to be more logical by function - but I am not that
> optimistic that this is going to actually be easier for people.
> Maybe I'm misunderstanding.
>
> What are the arguments for separating things -- to make it so people
> aren't scared by the number of modules so they'll code? It seems
> like some people just want it to be installed and run scripts - does
> having them install dozens of modules work.
Do we need to consider > people how much this would suck if someone can't use CPAN or > Module::Builder to automate dependancy tracking installation? How > does it work when modules are deprecated? See my upcoming reply to Chris. Briefly, if the only change is to the dist action of Build.PL, we can make a single archive of all modules available to non-CPAN users, and individual modules available to CPAN users. No problems. > Also - the main point I wanted to make - Can I suggest we spend a > little time discussing what it will take to get a stable release for > the current code as it stands (bioperl-live and bioperl-run)? It > seems like we really need to do this first so that we have a stable > release that can be followed by CVS -> SVN migration, then consider > major changes to the repository structure and release packaging, and > potential deprecation and incorporation of other modules. I'd recommend that a 'stable' release shouldn't happen until we resolve all the missing tests and bugzilla bugs (because I think the opportunity should be taken to have it stable both in terms of interface /and/ bugs). Which is a lot of work. > I assume there is no chance that we'd have a 1.6 candidate by BOSC > next month? None. From bix at sendu.me.uk Thu Jun 28 07:25:03 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 08:25:03 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> Message-ID: <4683624F.6020402@sendu.me.uk> Chris Fields wrote: > On Jun 27, 2007, at 5:43 PM, Sendu Bala wrote: >> What advantage is there of these defined splits instead of >> individual modules? 
As I see it you lose some of the potential >> benefits of breaking Bioperl up completely, whilst also suffering >> the maintenance problems I outlined in my objection to Steve's post. >> >> Being able to work on all Bioperl from a single cvs (ne svn) check >> out/ archive, whilst distributing it as individual modules on CPAN >> seems like the best of both worlds to me. What am I missing? > > Okay, forewarned, but here's my long-winded reasoning. The short and > sweet version: I (very) respectfully don't agree with you, at least > re: the idea we should commit all modules to CPAN independently. It > doesn't make any sense to me, but maybe you can elaborate more? > Maybe I'm misinterpreting what you mean? The short and sweet version: my proposal has all the benefits of yours, but none of the disadvantages. What's not to like? > Finally, all of this should wait until later. Much later, like after > a decent release, after svn, etc kind of 'later'. I think we can > agree on that. Hmm, not really. If it can be implemented by a change in just Build.PL and ModuleBuildBioperl, its really independent of everything else. That's the beauty of it: the only thing that changes is how things are uploaded to and downloaded from CPAN. The only person that normally deals with that issue is the pumpkin for a release, and he only cares about it at release time. In fact, if we're going to do it at all it makes sense to try it out on a minor release like 1.5.3. We've already got experience of doing it split-style from 1.5.2. (And let me tell you: splits at the code-base level suck.) > Individual CPAN modules: > > CPAN is not our personal versioning system; it may be if a > distribution consists of only a few modules, but not when it's one of > the largest distros present. If someone wants to update an > individual bioperl module for a quick bug fix they are more than > welcome to download it via cvs, svn, or even using a web browser, and > replace the one they have. 
And where is the harm in letting them do it via CPAN as well? In fact, there are significant benefits: > I'm trying to reason how one could break up the individual SeqIO/ > SearchIO/otherIO modules into single module distributions. They are > intrinsically tied together (SeqIO::genbank won't work w/o SeqIO, > which relies on the various interfaces, RootIO, and on down). How > would tests be run off CPAN when the modules are distributed > independently? Bio::SeqIO::genbank would have a dependency on the latest version of Bio::SeqIO (etc.), and Bio::SeqIO would have its own dependencies. So when a user wants to get the latest version of Bio::SeqIO::genbank, they no longer have to worry about what other modules in its dependency hierarchy they should also install. Instead they just request Bio::SeqIO::genbank which itself ensures you have the latest version of all its dependencies before installing itself and running its tests. When a dev makes a major bugfix to Bio::SeqIO::genbank that all genbank users should have, he could just call './Build dist Bio::SeqIO::genbank' which would generate a new package for Bio::SeqIO::genbank suitable for uploading to CPAN. No more long release cycles and having to constantly tell people to 'use CVS' to get working Bioperl code. > Would they also be individually distributed? What > would you use to tie all the individual modules together? How would > you explain to the CPAN maintainers that you want to split bioperl > into 990 individual modules, all updated independently, but intend on > bundling them afterwards anyway? They would be tied together by a CPAN bundle. You don't have to 'explain' anything to the CPAN maintainers because you're not doing anything wrong. In fact, you're using it the way you're supposed to. > Splitting up core: > > As I see it, here are the advantages of a defined split as Steve and > I see it (off the top of my head). 
Some of this probably reiterates > my previous points, as well as Steve's, so apologies in advance. Below I answer with how it would be with my single-module approach compared to the defined splits. > - A lean, mean, focused set of bioperl base modules (core) w/o or > with very few external deps, minimal installation issues, etc. The > very basic stuff to get up and running. Even leaner, even more focused. > - BioPerl bundled modules (Nathan's 'cliques') with defined, focused > functionality, code, and tests, which add a bit more 'sugar' to the > base functionality of the core. If you only care about parsing BLAST > reports, get SearchIO, which requires core and optionally other > modules (XML::SAX). If you want additional DB functionality apart > from the very basic ones in core, install DB (with it's additional > requirements, including core, DBI, and so on). Same with Graphics, > Tools, Tree/Phylo, etc. We just need to define and limit the number > of splits. The same can be achieved with CPAN bundles for each kind of functional grouping you can think of. And since its just a single text file that defines such a grouping, its easy to change or add new ones as you feel like it, as opposed to the rather more permanent and substantial effort of creating one of your splits on the code-base level. Also, the world doesn't have to rely on /our/ ideas of what a useful functional split is. If someone just wants to parse Blast results, they can just use CPAN to install Bio::SearchIO::blast_pull instead of having to install all of SearchIO. > - Easier to add additional bundled modules. For instance, I could > focus all of my RNA work into a discrete set of modules (say, bioperl- > rna) which I maintain, I ensure works with the latest core code, I > ensure also plays well with the other children =) , and I distribute > via CPAN. Same with EUtilities, which could go into a separated DB- > related set or stay in core. And if you lose interest in them? 
They eventually die because they no longer have someone looking after them by default (the pumpkin and other devs). Alternatively you could just make a CPAN bundle. One text file! Easy! No duplication of modules in CPAN, no new hassle for you or the Bioperl 'core' pumpkin to ensure that the latest version of each works with the others and with other splits.

> - If we want a full-fledged 'install everything', the CPAN Bundle
> system is available. I think it's easier to use a Bundle for 4-5,
> even 10 groups of modules as opposed to over 900.

No, it isn't any easier. It's /equally/ easy to install a bundle of 900 single-module packages as it is to install 5 packages that together contain the same 900 modules.

When not installing absolutely everything, but perhaps 'most' things, there's the additional benefit that it would be easier to skip a particular Bio::module because you didn't want to install its external dependencies and weren't that interested in it anyway.

> - A Bundle or a build file where discrete distributions are listed
> (Bio::SearchIO, etc) wouldn't need to be updated every time a new
> module is added to a distribution. I suppose this could be
> automated, but why have the additional headache?

Yes, it would be automated, and no, it wouldn't be any kind of additional headache at all. I'm proposing a fully-automated system that the pumpkin wouldn't even have to think about. Much /less/ of a headache than dealing with splits. Orders of magnitude easier to deal with.

> - A chance to cut out some cruft. We all know that particular areas
> need work or a complete overhaul (Restriction, Structure, maybe a few
> others). Smaller, concentrated sets of modules I believe would be
> easier to maintain, and those that don't get used will eventually fall
> out of favor and may be lost or replaced from the more maintained
> group of modules. Survival of the fittest.

And the smallest, most concentrated set of modules is the individual module.
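
A rough sketch of what the 'one text file' bundle mentioned above could look like, assuming the standard CPAN.pm bundle layout: a module in the Bundle:: namespace whose CONTENTS POD section lists one module per entry, optionally followed by a minimum version and a comment. The bundle name Bundle::BioPerl::BlastParsing and the version numbers are invented for illustration. The Bio:: modules listed are real, but shipping them as separate CPAN packages is the proposal under discussion, not the current state of affairs.

```perl
# Hypothetical bundle file: lib/Bundle/BioPerl/BlastParsing.pm
# The package name and the version numbers are illustrative assumptions.
package Bundle::BioPerl::BlastParsing;

use strict;
use warnings;

our $VERSION = '0.01';

1;

__END__

=head1 NAME

Bundle::BioPerl::BlastParsing - modules needed to parse BLAST reports

=head1 SYNOPSIS

  perl -MCPAN -e 'install Bundle::BioPerl::BlastParsing'

=head1 CONTENTS

Bio::Root::Root 1.005002 - base class; the minimum version is a placeholder

Bio::SearchIO 1.005002

Bio::SearchIO::blast

=head1 DESCRIPTION

Installs only what is needed to read BLAST output via Bio::SearchIO.

=cut
```

Installing such a bundle through the CPAN shell would pull in each listed module, plus whatever each of those modules declares as its own prerequisites, so regrouping functionality means editing this one file rather than moving code around.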
> - We already have had practice; bioperl-db, bioperl-run, bioperl-
> network, and others. Those that have been routinely maintained and
> enjoy wide use (db, run, network) have survived; others not so much
> (corba-related stuff, microarray, ext, etc., though the code is still
> available if someone else wants to take it up and revive it!).

The reason some of these existing splits (microarray, ext) have fallen by the wayside? /Because/ they're splits. If they had been part of bioperl-live all along, they'd have been kept in a working, compatible state and would have been released along with everything else in 1.5.2.

> Disadvantages of a defined split:
>
> - The initial headache of identifying which groups go where,
> coordinating with those who rely on bioperl (GMOD, etc) on how this
> will be set up, so on...

No need to worry about this with individual modules.

> - Separate groups of modules require testing together to ensure
> functionality is consistent and maintained (something I think you
> pointed out previously).

No need to worry.

> - I think an increased possibility of branching is possible.
>
> - Extra headaches for devs, who have to keep track of the various
> critical distributions and make sure they work well together.

No headaches.

From charles-listes+bioperl at plessy.org Thu Jun 28 07:40:04 2007
From: charles-listes+bioperl at plessy.org (Charles Plessy)
Date: Thu, 28 Jun 2007 16:40:04 +0900
Subject: [Bioperl-l] Is bioperl-live the nickname of the CVS head ?
Message-ID: <20070628074004.GD6338@kunpuu.plessy.org>

Dear developers,

I am considering bringing bioperl 1.5.2 into Debian, and I am wondering if it would make sense to call it "bioperl-live" and distribute it in parallel with the stable 1.4.0 version, if bioperl-live means "the current developer version". If I am wrong, can somebody explain to me what bioperl-live exactly refers to?
Have a nice day, -- Charles Plessy Debian-med packaging team Wako, Saitama, Japan From n.haigh at sheffield.ac.uk Thu Jun 28 08:23:10 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Thu, 28 Jun 2007 09:23:10 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: <4683624F.6020402@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: <46836FEE.5030203@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Sendu Bala wrote: > Chris Fields wrote: >> On Jun 27, 2007, at 5:43 PM, Sendu Bala wrote: >>> What advantage is there of these defined splits instead of >>> individual modules? As I see it you lose some of the potential >>> benefits of breaking Bioperl up completely, whilst also suffering >>> the maintenance problems I outlined in my objection to Steve's post. >>> >>> Being able to work on all Bioperl from a single cvs (ne svn) check >>> out/ archive, whilst distributing it as individual modules on CPAN >>> seems like the best of both worlds to me. What am I missing? >> >> Okay, forewarned, but here's my long-winded reasoning. The short and >> sweet version: I (very) respectfully don't agree with you, at least >> re: the idea we should commit all modules to CPAN independently. It >> doesn't make any sense to me, but maybe you can elaborate more? >> Maybe I'm misinterpreting what you mean? > > The short and sweet version: my proposal has all the benefits of yours, > but none of the disadvantages. What's not to like? > > >> Finally, all of this should wait until later. Much later, like after >> a decent release, after svn, etc kind of 'later'. I think we can >> agree on that. > > Hmm, not really. 
If it can be implemented by a change in just Build.PL > and ModuleBuildBioperl, its really independent of everything else. > That's the beauty of it: the only thing that changes is how things are > uploaded to and downloaded from CPAN. The only person that normally > deals with that issue is the pumpkin for a release, and he only cares > about it at release time. > > In fact, if we're going to do it at all it makes sense to try it out on > a minor release like 1.5.3. We've already got experience of doing it > split-style from 1.5.2. (And let me tell you: splits at the code-base > level suck.) > > >> Individual CPAN modules: >> >> CPAN is not our personal versioning system; it may be if a >> distribution consists of only a few modules, but not when it's one of >> the largest distros present. If someone wants to update an >> individual bioperl module for a quick bug fix they are more than >> welcome to download it via cvs, svn, or even using a web browser, and >> replace the one they have. > > And where is the harm in letting them do it via CPAN as well? In fact, > there are significant benefits: > > >> I'm trying to reason how one could break up the individual SeqIO/ >> SearchIO/otherIO modules into single module distributions. They are >> intrinsically tied together (SeqIO::genbank won't work w/o SeqIO, >> which relies on the various interfaces, RootIO, and on down). How >> would tests be run off CPAN when the modules are distributed >> independently? > > Bio::SeqIO::genbank would have a dependency on the latest version of > Bio::SeqIO (etc.), and Bio::SeqIO would have its own dependencies. > > So when a user wants to get the latest version of Bio::SeqIO::genbank, > they no longer have to worry about what other modules in its dependency > hierarchy they should also install. > > Instead they just request Bio::SeqIO::genbank which itself ensures you > have the latest version of all its dependencies before installing itself > and running its tests. 
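
To make the dependency chain in the quoted text above a little more concrete, a per-module Build.PL might declare its prerequisites roughly as follows. This is only a sketch using plain Module::Build; the proposal in this thread would go through ModuleBuildBioperl, which is not shown here, and the version numbers are placeholders rather than real release numbers.

```perl
# Hypothetical Build.PL for a stand-alone Bio::SeqIO::genbank package.
# Version numbers are placeholders, not real release numbers.
use strict;
use warnings;
use Module::Build;

my $build = Module::Build->new(
    module_name  => 'Bio::SeqIO::genbank',
    dist_version => '1.005003',        # assumed version of this package
    license      => 'perl',
    requires     => {
        # The CPAN client installs or upgrades this first; Bio::SeqIO
        # would in turn declare its own prerequisites in its Build.PL.
        'Bio::SeqIO' => '1.005003',
    },
    build_requires => {
        'Test::More' => 0,             # needed to run the package's tests
    },
);

$build->create_build_script;
```

With declarations like this in every package, a request for Bio::SeqIO::genbank alone would be enough for the CPAN client to walk the rest of the hierarchy.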
This was my thinking when I first brought this up at the beginning/splitting of this thread. Thinking of modules as the constituent parts of a larger package should make it far easier for people to define dependencies, and users would only need to install the parts they require. As Sendu points out, if the user wants to convert seqs from genbank to fasta they could simply install Bio::SeqIO::genbank and Bio::SeqIO::fasta and they would get all the other modules that are the dependencies of Bio::SeqIO::genbank and Bio::SeqIO::fasta.

>
> When a dev makes a major bugfix to Bio::SeqIO::genbank that all genbank
> users should have, he could just call './Build dist Bio::SeqIO::genbank'
> which would generate a new package for Bio::SeqIO::genbank suitable for
> uploading to CPAN. No more long release cycles and having to constantly
> tell people to 'use CVS' to get working Bioperl code.

However, how would the test suite work out with this? E.g. when someone installs Bio::SeqIO::genbank they want the tests associated with Bio::SeqIO::genbank to be run. Would there be tests that would be run redundantly if, for example, someone installed Bio::SeqIO::genbank and Bio::SeqIO::fasta?

>
>
>> Would they also be individually distributed? What would you use to
>> tie all the individual modules together? How would you explain to
>> the CPAN maintainers that you want to split bioperl into 990
>> individual modules, all updated independently, but intend on bundling
>> them afterwards anyway?
>
> They would be tied together by a CPAN bundle. You don't have to
> 'explain' anything to the CPAN maintainers because you're not doing
> anything wrong. In fact, you're using it the way you're supposed to.

Yep. Real modules are released as modules, each with its own set of dependencies. Then use CPAN bundles the way they were meant to be used: for distributing a set of CPAN modules that make up a coherent set of functionality.
You "could" also bundle in other authors modules e.g. Bio::ASN1::EntrezGene? > > >> Splitting up core: >> >> As I see it, here are the advantages of a defined split as Steve and >> I see it (off the top of my head). Some of this probably reiterates >> my previous points, as well as Steve's, so apologies in advance. > > Below I answer with how it would be with my single-module approach > compared to the defined splits. > > >> - A lean, mean, focused set of bioperl base modules (core) w/o or >> with very few external deps, minimal installation issues, etc. The >> very basic stuff to get up and running. > > Even leaner, even more focused. > > >> - BioPerl bundled modules (Nathan's 'cliques') with defined, focused >> functionality, code, and tests, which add a bit more 'sugar' to the >> base functionality of the core. If you only care about parsing BLAST >> reports, get SearchIO, which requires core and optionally other >> modules (XML::SAX). If you want additional DB functionality apart >> from the very basic ones in core, install DB (with it's additional >> requirements, including core, DBI, and so on). Same with Graphics, >> Tools, Tree/Phylo, etc. We just need to define and limit the number >> of splits. > > The same can be achieved with CPAN bundles for each kind of functional > grouping you can think of. And since its just a single text file that > defines such a grouping, its easy to change or add new ones as you feel > like it, as opposed to the rather more permanent and substantial effort > of creating one of your splits on the code-base level. > > Also, the world doesn't have to rely on /our/ ideas of what a useful > functional split is. If someone just wants to parse Blast results, they > can just use CPAN to install Bio::SearchIO::blast_pull instead of having > to install all of SearchIO. > > >> - Easier to add additional bundled modules. 
For instance, I could
>> focus all of my RNA work into a discrete set of modules (say, bioperl-
>> rna) which I maintain, I ensure works with the latest core code, I
>> ensure also plays well with the other children =) , and I distribute
>> via CPAN. Same with EUtilities, which could go into a separated DB-
>> related set or stay in core.
>
> And if you lose interest in them? They eventually die because they no
> longer have someone looking after them by default (the pumpkin and other
> devs). Alternatively you could just make a CPAN bundle. One text file!
> Easy! No duplication of modules in CPAN, no new hassle for you or the
> Bioperl 'core' pumpkin to ensure that the latest version of each work
> with each other and other splits.

Hmm, how would module versions be handled? Wouldn't this approach require each module to have its own independent version number, which could then be used for building the dependencies? Each new release of a module would only bump that module's version number. Bundles can specify the minimum version of a module to be installed, so that bug fixes to individual modules can be released to CPAN and would automatically get picked up when installing bundles etc.

I'm not quite sure how the current stable/dev releases would work. I assume bug fixes would have to be made on a branch (e.g. branch 1.6) and released to CPAN from there. Then when the next stable release is made, all module versions would be bumped and released to CPAN, with any modifications to the content of the bundle made at the same time. Is it possible to have stable and developer release bundles that specify the minimum stable and developer module versions respectively?

>
>
>> - If we want a full-fledged 'install everything', the CPAN Bundle
>> system is available. I think it's easier to use a Bundle for 4-5,
>> even 10 groups of modules as opposed to over 900.
>
> No, it isn't any easier.
Its /equally/ easy to install a bundle of 900 > packages of 900 modules as it is to install 5 packages of 900 modules. > > When not installing absolutely everything, but perhaps 'most' things, > there's the additional benefit that it would be easier to skip a > particular Bio::module because you didn't want to install its external > dependencies and weren't that interested in it anyway. > > >> - A Bundle or a build file where discrete distributions are listed >> (Bio::SearchIO, etc) wouldn't need to be updated every time a new >> module is added to a distribution. I suppose this could be >> automated, but why have the additional headache? > > Yes, it would be automated, and no, it wouldn't at all be any kind of > additional headache. I'm proposing a fully-automated system that the > pumpkin wouldn't even have to think about it. Much /less/ of a headache > than dealing with splits. Orders of magnitude easier to deal with. > > >> - A chance to cut out some cruft. We all know that particular areas >> need work or a complete overhaul (Restriction, Structure, maybe a few >> others). Smaller, concentrated sets of modules I believe would be >> easier to maintain, and those that don't get use will eventually fall >> out of favor and may be lost or replaced from the more maintained >> group of modules. Survival of the fittest. > > And the smallest, most concentrated set of modules is the individual > module. > > >> - We already have had practice; bioperl-db, bioperl-run, bioperl- >> network, and others. Those that have been routinely maintained and >> enjoy wide use (db, run, network) have survived; others not so much >> (corba-related stuff, microarray, ext, etc., though the code is still >> available if someone else wants to take it up and revive it!). > > The reason some of these existing splits (micoarray, ext) have fallen by > the way-side? /Because/ they're splits. 
If they had been part of
> bioperl-live all along, they'd have been kept in a working, compatible
> state and would have been released along with everything else in 1.5.2
>
>
>> Disadvantages of a defined split:
>>
>> - The initial headache of identifying which groups go where,
>> coordinating with those who rely on bioperl (GMOD, etc) on how this
>> will be set up, so on...
>
> No need to worry about this with individual modules.
>
>
>> - Separate groups of modules require testing together to ensure
>> functionality is consistent and maintained (something I think you
>> pointed out previously).
>
> No need to worry.

Maybe we need to worry about how the tests are run when installing individual modules etc?

>
>
>> - I think an increased possibility of branching is possible.
>>
>> - Extra headaches for devs, who have to keep track of the various
>> critical distributions and make sure they work well together.
>
> No headaches.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGg2/uczuW2jkwy2gRAlR4AJ44kHIXWWapNVGOIrkFBJdP9rn3vwCdErhT
VkymyXNshguE44/RilEXWDA=
=O5ex
-----END PGP SIGNATURE-----

From n.haigh at sheffield.ac.uk Thu Jun 28 08:27:54 2007
From: n.haigh at sheffield.ac.uk (Nathan S.
Haigh) Date: Thu, 28 Jun 2007 09:27:54 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: <4683624F.6020402@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: <4683710A.9010808@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Sendu Bala wrote: > Chris Fields wrote: >> On Jun 27, 2007, at 5:43 PM, Sendu Bala wrote: >>> What advantage is there of these defined splits instead of >>> individual modules? As I see it you lose some of the potential >>> benefits of breaking Bioperl up completely, whilst also suffering >>> the maintenance problems I outlined in my objection to Steve's post. >>> >>> Being able to work on all Bioperl from a single cvs (ne svn) check >>> out/ archive, whilst distributing it as individual modules on CPAN >>> seems like the best of both worlds to me. What am I missing? >> >> Okay, forewarned, but here's my long-winded reasoning. The short and >> sweet version: I (very) respectfully don't agree with you, at least >> re: the idea we should commit all modules to CPAN independently. It >> doesn't make any sense to me, but maybe you can elaborate more? >> Maybe I'm misinterpreting what you mean? > > The short and sweet version: my proposal has all the benefits of yours, > but none of the disadvantages. What's not to like? > > >> Finally, all of this should wait until later. Much later, like after >> a decent release, after svn, etc kind of 'later'. I think we can >> agree on that. > > Hmm, not really. If it can be implemented by a change in just Build.PL > and ModuleBuildBioperl, its really independent of everything else. 
> That's the beauty of it: the only thing that changes is how things are > uploaded to and downloaded from CPAN. The only person that normally > deals with that issue is the pumpkin for a release, and he only cares > about it at release time. > > In fact, if we're going to do it at all it makes sense to try it out on > a minor release like 1.5.3. We've already got experience of doing it > split-style from 1.5.2. (And let me tell you: splits at the code-base > level suck.) > > >> Individual CPAN modules: >> >> CPAN is not our personal versioning system; it may be if a >> distribution consists of only a few modules, but not when it's one of >> the largest distros present. If someone wants to update an >> individual bioperl module for a quick bug fix they are more than >> welcome to download it via cvs, svn, or even using a web browser, and >> replace the one they have. > > And where is the harm in letting them do it via CPAN as well? In fact, > there are significant benefits: > > >> I'm trying to reason how one could break up the individual SeqIO/ >> SearchIO/otherIO modules into single module distributions. They are >> intrinsically tied together (SeqIO::genbank won't work w/o SeqIO, >> which relies on the various interfaces, RootIO, and on down). How >> would tests be run off CPAN when the modules are distributed >> independently? > > Bio::SeqIO::genbank would have a dependency on the latest version of > Bio::SeqIO (etc.), and Bio::SeqIO would have its own dependencies. > > So when a user wants to get the latest version of Bio::SeqIO::genbank, > they no longer have to worry about what other modules in its dependency > hierarchy they should also install. > > Instead they just request Bio::SeqIO::genbank which itself ensures you > have the latest version of all its dependencies before installing itself > and running its tests. 
> > When a dev makes a major bugfix to Bio::SeqIO::genbank that all genbank > users should have, he could just call './Build dist Bio::SeqIO::genbank' > which would generate a new package for Bio::SeqIO::genbank suitable for > uploading to CPAN. No more long release cycles and having to constantly > tell people to 'use CVS' to get working Bioperl code. > > >> Would they also be individually distributed? What would you use to >> tie all the individual modules together? How would you explain to >> the CPAN maintainers that you want to split bioperl into 990 >> individual modules, all updated independently, but intend on bundling >> them afterwards anyway? > > They would be tied together by a CPAN bundle. You don't have to > 'explain' anything to the CPAN maintainers because you're not doing > anything wrong. In fact, you're using it the way you're supposed to. > The successor to Bundles - may prove interesting: http://search.cpan.org/~adamk/Task-1.01/lib/Task.pm > >> Splitting up core: >> >> As I see it, here are the advantages of a defined split as Steve and >> I see it (off the top of my head). Some of this probably reiterates >> my previous points, as well as Steve's, so apologies in advance. > > Below I answer with how it would be with my single-module approach > compared to the defined splits. > > >> - A lean, mean, focused set of bioperl base modules (core) w/o or >> with very few external deps, minimal installation issues, etc. The >> very basic stuff to get up and running. > > Even leaner, even more focused. > > >> - BioPerl bundled modules (Nathan's 'cliques') with defined, focused >> functionality, code, and tests, which add a bit more 'sugar' to the >> base functionality of the core. If you only care about parsing BLAST >> reports, get SearchIO, which requires core and optionally other >> modules (XML::SAX). 
If you want additional DB functionality apart >> from the very basic ones in core, install DB (with it's additional >> requirements, including core, DBI, and so on). Same with Graphics, >> Tools, Tree/Phylo, etc. We just need to define and limit the number >> of splits. > > The same can be achieved with CPAN bundles for each kind of functional > grouping you can think of. And since its just a single text file that > defines such a grouping, its easy to change or add new ones as you feel > like it, as opposed to the rather more permanent and substantial effort > of creating one of your splits on the code-base level. > > Also, the world doesn't have to rely on /our/ ideas of what a useful > functional split is. If someone just wants to parse Blast results, they > can just use CPAN to install Bio::SearchIO::blast_pull instead of having > to install all of SearchIO. > > >> - Easier to add additional bundled modules. For instance, I could >> focus all of my RNA work into a discrete set of modules (say, bioperl- >> rna) which I maintain, I ensure works with the latest core code, I >> ensure also plays well with the other children =) , and I distribute >> via CPAN. Same with EUtilities, which could go into a separated DB- >> related set or stay in core. > > And if you lose interest in them? They eventually die because they no > longer have someone looking after them by default (the pumpkin and other > devs). Alternatively you could just make a CPAN bundle. One text file! > Easy! No duplication of modules in CPAN, no new hassle for you or the > Bioperl 'core' pumpkin to ensure that the latest version of each work > with each other and other splits. > > >> - If we want a full-fledged 'install everything', the CPAN Bundle >> system is available. I think it's easier to use a Bundle for 4-5, >> even 10 groups of modules as opposed to over 900. > > No, it isn't any easier. 
Its /equally/ easy to install a bundle of 900 > packages of 900 modules as it is to install 5 packages of 900 modules. > > When not installing absolutely everything, but perhaps 'most' things, > there's the additional benefit that it would be easier to skip a > particular Bio::module because you didn't want to install its external > dependencies and weren't that interested in it anyway. > > >> - A Bundle or a build file where discrete distributions are listed >> (Bio::SearchIO, etc) wouldn't need to be updated every time a new >> module is added to a distribution. I suppose this could be >> automated, but why have the additional headache? > > Yes, it would be automated, and no, it wouldn't at all be any kind of > additional headache. I'm proposing a fully-automated system that the > pumpkin wouldn't even have to think about it. Much /less/ of a headache > than dealing with splits. Orders of magnitude easier to deal with. > > >> - A chance to cut out some cruft. We all know that particular areas >> need work or a complete overhaul (Restriction, Structure, maybe a few >> others). Smaller, concentrated sets of modules I believe would be >> easier to maintain, and those that don't get use will eventually fall >> out of favor and may be lost or replaced from the more maintained >> group of modules. Survival of the fittest. > > And the smallest, most concentrated set of modules is the individual > module. > > >> - We already have had practice; bioperl-db, bioperl-run, bioperl- >> network, and others. Those that have been routinely maintained and >> enjoy wide use (db, run, network) have survived; others not so much >> (corba-related stuff, microarray, ext, etc., though the code is still >> available if someone else wants to take it up and revive it!). > > The reason some of these existing splits (micoarray, ext) have fallen by > the way-side? /Because/ they're splits. 
If they had been part of > bioperl-live all along, they'd have been kept in a working, compatible > state and would have been released along with everything else in 1.5.2 > > >> Disadvantages of a defined split: >> >> - The initial headache of identifying which groups go where, >> coordinating with those who rely on bioperl (GMOD, etc) on how this >> will be set up, so on... > > No need to worry about this with individual modules. > > >> - Separate groups of modules require testing together to ensure >> functionality is consistent and maintained (something I think you >> pointed out previously). > > No need to worry. > > >> - I think an increased possibility of branching is possible. >> >> - Extra headaches for devs, who have to keep track of the various >> critical distributions and make sure they work well together. > > No headaches. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGg3EKczuW2jkwy2gRAriiAJ47Qz9jTshEXuaG0XMYrUTI0hHqAwCeL45r r/BykCKbM9lqJM0khARuEms= =NB4B -----END PGP SIGNATURE----- From n.haigh at sheffield.ac.uk Thu Jun 28 08:51:19 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Thu, 28 Jun 2007 09:51:19 +0100 Subject: [Bioperl-l] Is bioperl-live the nickname of the CVS head ? In-Reply-To: <20070628074004.GD6338@kunpuu.plessy.org> References: <20070628074004.GD6338@kunpuu.plessy.org> Message-ID: <46837687.7010101@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Charles Plessy wrote: > Dear developpers, > > I am considering to bring bioperl 1.5.2 in Debian, and I am wondering if > it would make sense to call it "bioperl-live" and distribute it in > parallel with the stable 1.4.0 version, if bioperl-live means "the > current developepr version". > > If I am wrong, can somebody explain me what bioperl-live exactly refers > to ? 
> > Have a nice day, > bioperl-live really means the HEAD of the cvs repository, so it is the most bleeding-edge code available. Version 1.5.* is the developer release, while the 1.4.* is the stable release. However, there have been few updates to the 1.4.* release which means that it is more unstable than the 1.5.* dev release. I think the consensus was to have more rapid release cycles of the stable branch in future in order to avoid this. I'm sure there are others more qualified to expand/correct me on this if needs be. Nath -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGg3aHczuW2jkwy2gRAo5pAJ95BGqrA5bLwRKNfUQi/HfBnkUJjwCg0mYB /fHFyYkqAvcmOSxu4djPll0= =KwVH -----END PGP SIGNATURE----- From bix at sendu.me.uk Thu Jun 28 09:11:39 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 10:11:39 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: <46836FEE.5030203@sheffield.ac.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> <46836FEE.5030203@sheffield.ac.uk> Message-ID: <46837B4B.7060705@sendu.me.uk> Nathan S. Haigh wrote: (Please try and snip more: don't quote whole posts just to reply to certain paragraphs) > Sendu Bala wrote: >> Chris Fields wrote: >> When a dev makes a major bugfix to Bio::SeqIO::genbank that all genbank >> users should have, he could just call './Build dist Bio::SeqIO::genbank' >> which would generate a new package for Bio::SeqIO::genbank suitable for >> uploading to CPAN. No more long release cycles and having to constantly >> tell people to 'use CVS' to get working Bioperl code. > > However, how would the test suite work out with this? e.g.
when someone > installs Bio::SeqIO::genbank they want to have the tests associated with > Bio::SeqIO::genbank to be run. Would there be tests that would be run > redundantly if for example someone installed Bio::SeqIO::genbank and > Bio::SeqIO::fasta? We would want to move to a strict test-script-per-module system. But that's desirable in any case, as it would greatly ease reaching our goal of complete test coverage, and subsequent maintenance of those tests. The genbank test would only run tests specific to genbank parsing, and likewise for fasta. They would both have a dependency on Bio::SeqIO, and if that was also recently updated, it would get installed prior to you installing genbank (and therefor run its own generic SeqIO tests), but wouldn't get installed again (wouldn't run its tests again) when you install fasta afterwards. On the subject of tests, I'm reminded of another benefit of the individual-module approach. Currently if a test fails during a CPAN install, nothing gets installed. Users do one of: # refuse to install at all (strict sys-admins) # cry and give up (newbies) # cry and seek help (newbies who really really need Bioperl) # force install, leaving them in some undefined state because they didn't understand the problems (most remaining users) # force install, happy that the problems are ok (some Bioperl devs) With a bundle of individual modules you would install virtually all Bioperl modules with no problems, and the problems with the remainder would be clear to everyone. No one would need to force install since the tests results would now be meaningful: the thing you're trying to install really isn't going to work if the tests are failing. If you really needed that particular Bioperl module you could then pay particular attention to why its failing (most likely some problem with an external dependency). >>> Would they also be individually distributed? What would you use to >>> tie all the individual modules together? 
>> >> They would be tied together by a CPAN bundle. You don't have to >> 'explain' anything to the CPAN maintainers because you're not doing >> anything wrong. In fact, you're using it the way you're supposed to. > > Yep. Real modules are released as modules, each with its own set of > dependencies. Then use CPAN bundles the way they were supposed to be used, > for distributing a set of CPAN modules that make a coherent set of > functionality. You "could" also bundle in other authors' modules e.g. > Bio::ASN1::EntrezGene? Any bundle featuring Bio::SeqIO::entrezgene would necessarily include Bio::ASN1::EntrezGene in the bundle. > Hmm, how would module versions be handled? Wouldn't this approach > require each module to have its own independent version number, which > could then be used for building the dependencies? Each new release of > that module would only bump that module's version number. Yes, that's how it would work. No more global version number. > Bundles can specify the minimum version of a module to be installed, > such that bug fixes to individual modules can be released into CPAN and > would automatically get picked up when installing bundles etc. Yes. > I'm not quite sure how the current stable/dev releases would work. I > assume bug fixes would have to be made on a branch e.g. branch 1.6 and > released to cpan from there. Then when the next stable release is made, > all module versions would be bumped and released to CPAN. With any > modifications to the content of the bundle to be made. Is it possible to > have stable and developer release bundles that are able to specify the > minimum stable and developer module versions respectively? No, the distinction becomes pretty meaningless. We could still do big major releases, but modules wouldn't be version-bumped. The big release would just be an update of the bundle that specifies the latest version of all Bioperl modules.
Remember that bundles only specify the minimum version, not the required version: in this brave new world users would end up with the same versions of modules if they installed a 1.8 bundle compared to 1.7 bundle. The only way to get a true snapshot of 1.7 after it was released would be if we took snapshots and archived them, making them available from bioperl.org (or by checking out the 1.7 tag from cvs/svn). I don't see that as a significant problem. You lose the trivial benefit of being able to install old snapshots from CPAN. The people who have a great need to install old snapshots can find their way to bioperl.org no problem. From bix at sendu.me.uk Thu Jun 28 08:50:09 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 09:50:09 +0100 Subject: [Bioperl-l] Is bioperl-live the nickname of the CVS head ? In-Reply-To: <20070628074004.GD6338@kunpuu.plessy.org> References: <20070628074004.GD6338@kunpuu.plessy.org> Message-ID: <46837641.8050106@sendu.me.uk> Charles Plessy wrote: > I am considering to bring bioperl 1.5.2 in Debian, and I am wondering if > it would make sense to call it "bioperl-live" and distribute it in > parallel with the stable 1.4.0 version, if bioperl-live means "the > current developepr version". > > If I am wrong, can somebody explain me what bioperl-live exactly refers > to ? bioperl-live is the name of the CVS repository containing what is currently considered the 'Core package' or core modules. http://www.bioperl.org/wiki/Using_CVS If you want to call it something to distinguish it from stable, call it 'developer' vs 'stable' or '1.5.2' vs '1.4.0'. To distinguish them both from the other packages, call them 'core' vs 'run' etc. 
From hlapp at gmx.net Thu Jun 28 10:31:29 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 28 Jun 2007 07:31:29 -0300 Subject: [Bioperl-l] Splits again In-Reply-To: <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org> Message-ID: <23C3729D-6E13-445F-99AF-54B4DC0BB4E8@gmx.net> On Jun 28, 2007, at 12:51 AM, Jason Stajich wrote: > [...] Also - the main point I wanted to make - Can I suggest we > spend a > little time discussing what it will take to get a stable release for > the current code as it stands (bioperl-live and bioperl-run)? It > seems like we really need to do this first so that we have a stable > release that can be followed by CVS -> SVN migration, then consider > major changes to the repository structure and release packaging, and > potential deprecation and incorporation of other modules. I agree we need to discuss a path towards 1.6, but I think that should be kept separate from the cvs->svn migration. Otherwise one stalls the other (by stopping people who seem to have the energy and motivation right now to do one but not the other) for no really good reason. > I assume there is no chance that we'd have a 1.6 candidate by BOSC > next month? I'm not sure that's feasible to be happening but if someone steps up it maybe it is. > > Will it be productive to schedule a fair amount of time at BOSC > discussing how to partition out the packages into separate sub- > packages after we've done a successful release rather than trying to > change things right now? I agree. I also don't think that people are partitioning right now (other than the existing partitioning), though maybe I'm mistaken. 
> [...] > It would probably mean moving Bio::Graphics, Bio::DB::GFF and > Bio::DB::SeqFeature and gff tools for Gbrowse into separate packages > so they could be released more regularly on par with Gbrowse > schedules. Possibly. I'm not fully sure why those modules couldn't also be released more often out of the "main trunk" of modules. In Java/ant, it'd be relatively easy to write build script filters that select the appropriate modules and package them on the fly. I'm not sure whether the build tools for Perl can do that too, though. > Also I think someone needs to figure out Bio::Tools::GFF > vs Bio::FeatureIO -- what do we want to do? I believe FeatureIO has the ontology download tied into it? -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Thu Jun 28 10:47:39 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 28 Jun 2007 07:47:39 -0300 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> <18051.1253.87485.235496@almost.alerce.com> Message-ID: On Jun 28, 2007, at 12:29 AM, Jason Stajich wrote: > As I tried to ask for in the past, would someone also illustrate the > importance of why _WE_ need to switch to SVN on a wiki page on > Bioperl so that when someone complains/asks about this in the future > the arguments are already laid out. I am basically fine with it, but > I don't honestly see a compelling reason beyond what has been > mentioned wrt better integration in IDEs. > http://bioperl.org/wiki/Why_SVN I guess at the end of the day svn is just the system of choice for new developers. I've had people tell me who started with svn that cvs seems a lot harder to use. 
The newer projects are all on svn and for example to integrate Bio::Phylo into BioPerl should become a question of the revision control system. At the end of the day if being on svn makes it easier for new people to contribute it's enough of an argument for me, whether it's rational or not. IMHO, there's two advantages that svn has over cvs. First, directories are versioned, have properties, and generally are the same class of citizens as files. They can be added, renamed, and removed from the repository. In cvs, we all know what a hassle it is to rename or even retire directories. Second, svn log gives you the commits, i.e., the set of changes that constituted one particular commit (and therefore version increase). In cvs that's hard or impossible to reconstruct. Bottom line - I don't think many people if any will question why we moved from cvs to svn ... My $0.02 ... -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hartzell at alerce.com Thu Jun 28 00:34:37 2007 From: hartzell at alerce.com (George Hartzell) Date: Wed, 27 Jun 2007 20:34:37 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <1CB82DD7-A997-47E8-A6A5-2AC9B375E875@uiuc.edu> References: <4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu> <9FEF16FD-47F6-4D7B-A1E6-1A92CC82570C@gmx.net> <1CB82DD7-A997-47E8-A6A5-2AC9B375E875@uiuc.edu> Message-ID: <18051.541.684705.567954@almost.alerce.com> Chris Fields writes: > We should port them all, yes. > > chris > > On Jun 27, 2007, at 4:35 PM, Hilmar Lapp wrote: > > > Is there a reason not to port every subproject over? > > > > -hilmar They're all there. At least everything that I found in the CVS repo. Some of the directories were empty, some had very little content, I was just mechanical about it. 
Here's what I have: [hartzell at dev ~]$ svn ls file://`pwd`/bioperl biodata/ bioperl-cookbook/ bioperl-corba-client/ bioperl-corba-server/ bioperl-das-client/ bioperl-db/ bioperl-ext/ bioperl-gui/ bioperl-live/ bioperl-microarray/ bioperl-network/ bioperl-papers/ bioperl-pedigree/ bioperl-pipeline/ bioperl-run/ biosql-schema/ html/ task-manager/ xml-html/ I wasn't very clear in my original request, but I was hoping that someone out there who's familiar with the various out-of-the-way bits and pieces could take a look at them. I was afraid that everyone was just checking out bioperl-live and doing 'make test'. Someone (chris?) made a point about binary files in bioperl-run. It'd be great if someone in the know could check on them. Also, to the degree that it's possible, look around at various tags and branches and see if they're what you'd expect. Thanks! g. From bix at sendu.me.uk Thu Jun 28 12:21:37 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 13:21:37 +0100 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <18049.30026.61328.134490@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> Message-ID: <4683A7D1.8070403@sendu.me.uk> George Hartzell wrote: > Chris Fields writes: > > [...] > > It looks like George Hartzell may be taking a crack at it, with > > Rutger Vos, Nathan Haigh, and moi helping out where needed. If so we > > could have something testable relatively soon. After that we'll need > > to work out a few other issues, basically what's on Hilmar's list. > > There's a repository on file:///home/hartzell/bioperl with all of the > components projects in place. 
> > If you have a dev.open-bio.org account and you're in the bioperl > group, you're good to get at it via: > > file:///home/hartzell/bioperl I'm confused. Presumably that only works whilst logged into dev.open-bio.org? > svn+ssh://dev.open-bio.org/home/hartzell/bioperl I just tried: svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl on Mac OS X and things seemed to go well, except for this error message at the end: svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data' svn: Can't move source to dest svn: Can't move 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' to 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': No such file or directory I also ended up with only: bioperl-corba-server bioperl-db bioperl-live bioperl-network bioperl-papers biosql-schema Am I doing something totally wrong here? From hartzell at alerce.com Thu Jun 28 12:32:36 2007 From: hartzell at alerce.com (George Hartzell) Date: Thu, 28 Jun 2007 08:32:36 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> <18051.1253.87485.235496@almost.alerce.com> Message-ID: <18051.43620.481558.447399@almost.alerce.com> Jason Stajich writes: > [...] > The repository machine (dev) is a locked down machine meaning it only > really runs ssh and not many servers include httpd. We have > anonymous CVS (client and through httpd browsing) running on a > separate machine (code) that has the info rsynced over every 10 or 15 > minutes. A great way to provide a read-only mirror of the repos. for anonymous users is to have svnsync running out of cron on code.open-bio.org, configured to pull from the dev.open-bio.org repository. 
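A rough sketch of what the svnsync-from-cron setup could look like, in case it helps the discussion. This is an illustration under assumptions, not a tested recipe for the open-bio machines: MIRROR_DIR is a made-up path, the function names are invented for this example, and it assumes the cron user can reach dev over svn+ssh with key-based login.

```shell
#!/bin/sh
# Sketch only. MIRROR_DIR is a hypothetical location on the mirror host;
# SOURCE_URL is the repository URL quoted earlier in this thread.
MIRROR_DIR=${MIRROR_DIR:-/path/to/bioperl-mirror}
SOURCE_URL=${SOURCE_URL:-svn+ssh://dev.open-bio.org/home/hartzell/bioperl}

# One-time setup: create an empty repository and register it as a mirror.
init_mirror() {
    svnadmin create "$MIRROR_DIR" || return 1
    # svnsync copies revision properties, so the mirror has to permit
    # revprop changes; an always-succeeding hook is the simplest way.
    printf '#!/bin/sh\nexit 0\n' > "$MIRROR_DIR/hooks/pre-revprop-change"
    chmod +x "$MIRROR_DIR/hooks/pre-revprop-change"
    svnsync init "file://$MIRROR_DIR" "$SOURCE_URL"
}

# Pull any revisions the mirror does not have yet.
sync_mirror() {
    svnsync sync --non-interactive "file://$MIRROR_DIR"
}

# Print a crontab line that syncs every N minutes (default 15).
mirror_cron_entry() {
    minutes=${1:-15}
    printf '*/%s * * * * svnsync sync --non-interactive file://%s\n' \
        "$minutes" "$MIRROR_DIR"
}
```

The idea would be to run init_mirror once, then paste the output of mirror_cron_entry into the crontab of whichever account owns the mirror. Nobody should commit to the mirror directly, since svnsync expects to be the only writer.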
It might actually work to have rsync mirror the fsfs-backed repository, but that's scary-poking-into-the-internals. g. From hartzell at alerce.com Thu Jun 28 12:43:37 2007 From: hartzell at alerce.com (George Hartzell) Date: Thu, 28 Jun 2007 08:43:37 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> Message-ID: <18051.44281.831316.749586@almost.alerce.com> David Messina writes: > > On Jun 27, 2007, at 2:49 PM, Hilmar Lapp wrote: > > > > > On Jun 27, 2007, at 1:27 PM, David Messina wrote: > > > >> I would think we would want "Author Date Id Rev URL" set on > >> everything, no?. So either cvs2svn or your tool (whichever you think > >> is better), followed by > >> > >> svn propset svn:keywords "Author Date Id Rev URL" * > > > > Shouldn't this be done recursively? > > > Yep, good catch! Thanks, Hilmar. > > Should be: > > svn propset --recursive svn:keywords "Author Date Id Rev URL" * That's not quite what you want either. It'll set the keyword property on all of the files, including things where you probably don't want expansion to happen (e.g. images, someone said there are binary wads in bioperl-run, etc...). The Right Thing To Do is to grub around (grep) for '\$Id:' (and the others) and set svn:keywords on files that are already using keywords. I have a Bourne shell hack that'll do this, although it's painful because it has to run in working directories.... Once we settle on a list of keywords to use, I'll take a whack at the demo repository.
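For illustration, the grep-then-propset idea might be sketched roughly like this. It is not the actual shell hack mentioned above; the function names are invented for the example, the keyword list is the "Author Date Id Rev URL" set proposed in this thread, and it relies on grep's -I option (available in GNU and BSD grep) to skip binary files.

```shell
#!/bin/sh
# Sketch only; function names are hypothetical.

# List files under a directory (default: .) that already contain an
# RCS-style keyword, either bare ($Id$) or expanded ($Id: ... $).
# .svn administrative directories are pruned and binary files are skipped.
files_using_keywords() {
    dir=${1:-.}
    find "$dir" -name .svn -prune -o -type f \
        -exec grep -IlE '\$(Author|Date|Id|Rev|URL)(:[^$]*)?\$' {} +
}

# Set svn:keywords only on the files found above. Has to be run inside
# a Subversion working copy.
set_keywords_where_used() {
    files_using_keywords "${1:-.}" | while IFS= read -r f; do
        svn propset svn:keywords "Author Date Id Rev URL" "$f"
    done
}
```

Running files_using_keywords on its own first gives a list that can be eyeballed before any properties are changed; set_keywords_where_used then only modifies the working copy, so nothing reaches the repository until someone commits.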
Likewise, you probably DON'T want to use this in your config file:

  enable-auto-props = yes
  * = svn:keywords="Author Date Id Rev URL"

since it'll do the same thing. The Right Thing To Do is a more tedious

  *.pl = svn:keywords="Author Date Id Rev URL"
  *.pm = svn:keywords="Author Date Id Rev URL"
  *.c = svn:keywords="Author Date Id Rev URL"

A bit of googling will give you a good starting point for the list, and we should probably maintain a common one somewhere in the repo.

I don't think that there's a server side way of doing this, short of running some script via a hook around commit time.

g.

From hartzell at alerce.com Thu Jun 28 12:54:40 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 08:54:40 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy]
In-Reply-To:
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca> <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu> <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu> <18051.1253.87485.235496@almost.alerce.com>
Message-ID: <18051.44944.982207.37624@almost.alerce.com>

Hilmar Lapp writes:
> [...]
> IMHO, there's two advantages that svn has over cvs. First,
> directories are versioned, have properties, and generally are the
> same class of citizens as files. They can be added, renamed, and
> removed from the repository. In cvs, we all know what a hassle it is
> to rename or even retire directories. Second, svn log gives you the
> commits, i.e., the set of changes that constituted one particular
> commit (and therefore version increase). In cvs that's hard or
> impossible to reconstruct.

Two more:

- svn groups changes into revisions, so that they can be considered together, whereas CVS versions individual files.
- subversion tracks renames/moves correctly.
- subversion commits are atomic, so you never have to worry about whether all of your stuff made it into the repository at the same time [if you've never had to un-muck this, count yourself blessed!].
- svk, which allows disconnected development while still committing your work to a repo at natural points along the way (you can revert, branch, etc. to your heart's content).

[yeah, that's 3, err, 4. Math is hard.]

g.

From cjfields at uiuc.edu Thu Jun 28 13:07:24 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 08:07:24 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <23C3729D-6E13-445F-99AF-54B4DC0BB4E8@gmx.net>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org> <23C3729D-6E13-445F-99AF-54B4DC0BB4E8@gmx.net>
Message-ID: <01812F01-9409-49FB-9061-330FA52177C1@uiuc.edu>

On Jun 28, 2007, at 5:31 AM, Hilmar Lapp wrote:

>
> On Jun 28, 2007, at 12:51 AM, Jason Stajich wrote:
>
>> ...It
>> seems like we really need to do this first so that we have a stable
>> release that can be followed by CVS -> SVN migration, then consider
>> major changes to the repository structure and release packaging, and
>> potential deprecation and incorporation of other modules.
>
> I agree we need to discuss a path towards 1.6, but I think that
> should be kept separate from the cvs->svn migration. Otherwise one
> stalls the other (by stopping people who seem to have the energy and
> motivation right now to do one but not the other) for no really good
> reason.

It's good to discuss it as long as it doesn't take time and energy away from other priorities.

>> I assume there is no chance that we'd have a 1.6 candidate by BOSC
>> next month?
>
> I'm not sure that's feasible to be happening but if someone steps up
> it maybe it is.

Maybe a 1.5.3 and (if we work hard on it) a 1.6 soon after.
Then maybe work on partitioning if everyone's up for it and a scheme is worked out. >> Will it be productive to schedule a fair amount of time at BOSC >> discussing how to partition out the packages into separate sub- >> packages after we've done a successful release rather than trying to >> change things right now? > > I agree. I also don't think that people are partitioning right now > (other than the existing partitioning), though maybe I'm mistaken. The original proposal was based on Steve's idea of splitting up core. I don't think a partition is feasible at this point, at least until we put more thought into it (our energy should be focused elsewhere), but it's well worth discussing as a future path. At this time there are two proposals: 1) Steve's and my 'split into discrete sections' proposal, where we split core into self-sustaining sections with a common core listed as a dependency, tying installation of all together with a Bundle or similar. 2) Sendu's 'break everything up' approach where all modules are submitted independently to CPAN, with their own tests, dependencies, etc. There are advantages and disadvantages to both approaches. Not sure if CPAN would go for the latter (it's pretty drastic), but I don't know for sure. If you want in on that discussion (in this thread) feel free to join in! The more the merrier! >> [...] >> It would probably mean moving Bio::Graphics, Bio::DB::GFF and >> Bio::DB::SeqFeature and gff tools for Gbrowse into separate packages >> so they could be released more regularly on par with Gbrowse >> schedules. > > Possibly. I'm not fully sure why those modules couldn't also be > released more often out of the "main trunk" of modules. In Java/ant, > it'd be relatively easy to write build script filters that select the > appropriate modules and package them on the fly. I'm not sure whether > the build tools for Perl can do that too, though. 
Both approaches above would probably use Module::Build to install other bioperl dependencies, each of which could have it's own dependency set, possibly using a Bundle to tie everything together. >> Also I think someone needs to figure out Bio::Tools::GFF >> vs Bio::FeatureIO -- what do we want to do? > > I believe FeatureIO has the ontology download tied into it? > > -hilmar From recent posts here and on the gbrowse mail list by Scott and Lincoln, it seemed like they were moving away from using Bio::DB::GFF and were trying to get users to switch to Bio::DB::SeqFeature. Maybe should get a more direct response? chris From hartzell at alerce.com Thu Jun 28 13:16:18 2007 From: hartzell at alerce.com (George Hartzell) Date: Thu, 28 Jun 2007 09:16:18 -0400 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <4683A7D1.8070403@sendu.me.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> Message-ID: <18051.46242.942184.758493@almost.alerce.com> Sendu Bala writes: > George Hartzell wrote: > > Chris Fields writes: > > > [...] > > > It looks like George Hartzell may be taking a crack at it, with > > > Rutger Vos, Nathan Haigh, and moi helping out where needed. If so we > > > could have something testable relatively soon. After that we'll need > > > to work out a few other issues, basically what's on Hilmar's list. > > > > There's a repository on file:///home/hartzell/bioperl with all of the > > components projects in place. > > > > If you have a dev.open-bio.org account and you're in the bioperl > > group, you're good to get at it via: > > > > file:///home/hartzell/bioperl > > I'm confused. Presumably that only works whilst logged into > dev.open-bio.org? 
Yes, that only works if you're actually on the machine. > > svn+ssh://dev.open-bio.org/home/hartzell/bioperl > > I just tried: > > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl > > on Mac OS X and things seemed to go well, except for this error message > at the end: > > > svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data' > svn: Can't move source to dest > svn: Can't move > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' > to > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': > No such file or directory > > I also ended up with only: > bioperl-corba-server bioperl-db bioperl-live > bioperl-network bioperl-papers biosql-schema > > > Am I doing something totally wrong here? It looks like you tried to check out the *entire* repository. It never occured to me to try that. I'll take a look at what you reported. g. From bix at sendu.me.uk Thu Jun 28 13:20:19 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 14:20:19 +0100 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <18051.46242.942184.758493@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.46242.942184.758493@almost.alerce.com> Message-ID: <4683B593.3050108@sendu.me.uk> George Hartzell wrote: > Sendu Bala writes: >> I just tried: >> >> svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl [snip] > It looks like you tried to check out the *entire* repository. Yes. If you don't want everything, how does one 'browse' the repository to find out the address of the thing you /do/ want? > It never occured to me to try that. I'll take a look at what you > reported. Cheers. 
From bix at sendu.me.uk Thu Jun 28 13:27:29 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 14:27:29 +0100 Subject: [Bioperl-l] SVN and ...Re: Perltidy In-Reply-To: <18049.22260.967524.353173@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.22260.967524.353173@almost.alerce.com> Message-ID: <4683B741.5020600@sendu.me.uk> George Hartzell wrote: > There don't seem to be any .cvsignore files in the repository, or in > CVSROOT/cvsignore. > > Am I missing something, or don't we use them? It would be great to have the following files svn:ignored : In all package roots: ? Build ? MANIFEST ? MANIFEST.SKIP ? META.yml ? _build ? bioperl-*.tar.bz2 ? bioperl-*.tar.gz ? bioperl-*.zip ? blib ? cover_db In any and all directories: ? .DS_Store ? .DAV In bioperl-live: ? t/BioDBSeqFeature.t ? t/BioDBSeqFeature_BDB.t ? t/BioDBSeqFeature_mysql.t Can't think of anything else right now. Thanks for your efforts, Sendu. From cjfields at uiuc.edu Thu Jun 28 13:30:43 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 28 Jun 2007 08:30:43 -0500 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <4683A7D1.8070403@sendu.me.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> Message-ID: On Jun 28, 2007, at 7:21 AM, Sendu Bala wrote: >> ... >> file:///home/hartzell/bioperl > > I'm confused. Presumably that only works whilst logged into > dev.open-bio.org? Yes, it's just a tester. 
>> svn+ssh://dev.open-bio.org/home/hartzell/bioperl > > I just tried: > > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl Try 'svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- live/trunk /mybiodir' to check out the main trunk for core. chris From hartzell at alerce.com Thu Jun 28 13:57:00 2007 From: hartzell at alerce.com (George Hartzell) Date: Thu, 28 Jun 2007 09:57:00 -0400 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <4683A7D1.8070403@sendu.me.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> Message-ID: <18051.48684.996884.134046@almost.alerce.com> Sendu Bala writes: > [...] > I just tried: > > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl > > on Mac OS X and things seemed to go well, except for this error message > at the end: > > > svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data' > svn: Can't move source to dest > svn: Can't move > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' > to > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': > No such file or directory > > I also ended up with only: > bioperl-corba-server bioperl-db bioperl-live > bioperl-network bioperl-papers biosql-schema > > > Am I doing something totally wrong here? So, you probably wanted something like svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk to pick up the head of the bioperl live tree (or /.../bioperl-run/trunk, etc...). 
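The checkout addresses mentioned above all follow one pattern: the repository root, then a component directory, then trunk. A small sketch of that pattern follows, assuming the test repository stays at the location given in this thread (it is a first cut and may well move). The names trunk_url and checkout_command are hypothetical helpers, and the second only prints the svn command rather than running it.

```shell
#!/bin/sh
# Sketch only: build the trunk URL for one component of the test repository.
# REPO_ROOT is the location quoted in this thread; trunk_url and
# checkout_command are hypothetical helper names.

REPO_ROOT="svn+ssh://dev.open-bio.org/home/hartzell/bioperl"

# Print the trunk URL for component $1 (for example bioperl-live).
trunk_url() {
    printf '%s/%s/trunk\n' "$REPO_ROOT" "$1"
}

# Print (do not execute) the checkout command for component $1,
# checking out into a local directory named after the component.
checkout_command() {
    printf 'svn co %s %s\n' "$(trunk_url "$1")" "$1"
}
```

For example, checkout_command bioperl-live prints "svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk bioperl-live", which avoids pulling every tag and branch of every component the way a checkout of the repository root does.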
I just checked out

  svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/

and it ran to completion and gave me

  (delicious)[6:50am]~/tmp>>ls bioperl | cat
  biodata
  bioperl-cookbook
  bioperl-corba-client
  bioperl-corba-server
  bioperl-das-client
  bioperl-db
  bioperl-ext
  bioperl-gui
  bioperl-live
  bioperl-microarray
  bioperl-network
  bioperl-papers
  bioperl-pedigree
  bioperl-pipeline
  bioperl-run
  biosql-schema
  html
  task-manager
  xml-html

Can another Mac OS X user out there give the Great Big Checkout a try and see if it runs to completion? Potential problems that come to mind are:

- the "Macs are case insensitive, sort of" problem
- you filled up your disk
- something else.

g.

From charles-listes+bioperl at plessy.org Thu Jun 28 13:44:56 2007
From: charles-listes+bioperl at plessy.org (Charles Plessy)
Date: Thu, 28 Jun 2007 22:44:56 +0900
Subject: [Bioperl-l] Is bioperl-live the nickname of the CVS head ?
In-Reply-To: <46837687.7010101@sheffield.ac.uk>
References: <20070628074004.GD6338@kunpuu.plessy.org> <46837687.7010101@sheffield.ac.uk>
Message-ID: <20070628134456.GB14492@kunpuu.plessy.org>

On Thu, Jun 28, 2007 at 09:51:19AM +0100, Nathan S. Haigh wrote:
>
> Version 1.5.* is the developer release, while the 1.4.* is the stable
> release. However, there have been few updates to the 1.4.* release which
> means that it is more unstable than the 1.5.* dev release. I think the
> consensus, was to have more rapid release cycles of the stable branch in
> future in order to avoid this. I'm sure there are others more qualified
> to expand/correct me on this if needs be.

Ok, thank you all for the answers. I think that I will simply upgrade bioperl to 1.5.2 in Debian testing, and maybe rename it bioperl-core when I package other components.
Have a nice day, -- Charles Plessy Debian-Med packaging team Wako, Saitama, Japan From bix at sendu.me.uk Thu Jun 28 14:19:49 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 15:19:49 +0100 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <18051.48684.996884.134046@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> Message-ID: <4683C385.3050904@sendu.me.uk> George Hartzell wrote: > Sendu Bala writes: > > [...] > > I just tried: > > > > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl > > > > on Mac OS X and things seemed to go well, except for this error message > > at the end: > > > > > > svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data' > > svn: Can't move source to dest > > svn: Can't move > > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' > > to > > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': > > No such file or directory > > > > I also ended up with only: > > bioperl-corba-server bioperl-db bioperl-live > > bioperl-network bioperl-papers biosql-schema I tried again in the same location and it told me I had to 'svn cleanup', which I did. But subsequently it kept complaining about files already being there. > I just checked out > > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/ > > and it ran to completion [snip] > Can another mac os x user out there give the Great Big Checkout a try > and see if it runs to completion. Potential problems that come to > mind are: > > - the "mac's are case insensitive, sort of" problem > - you filled up your disk > - something else. 
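The first possibility in the list above can be checked from any machine by looking for repository paths that differ only in case, since a case-insensitive file system (the Mac OS X default) cannot hold both. A minimal sketch follows; case_collisions is a hypothetical helper name, and nothing in this thread confirms that such a collision is actually the cause of the failure.

```shell
#!/bin/sh
# Sketch only: read path names on standard input, one per line, and print
# the lower-cased form of every path that occurs more than once once case
# is ignored. case_collisions is a hypothetical helper name.
case_collisions() {
    tr '[:upper:]' '[:lower:]' | sort | uniq -d
}
```

It could be fed from a recursive listing of the directory that fails, for example svn ls -R svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data | case_collisions. Empty output would point away from the case-sensitivity explanation.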
Well, I didn't run out of disc space. After a rm -fr * and trying again it failed at exactly the same point, in the same way. svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data causes this repeatable problem: [...] A data/phredfile.phd svn: In directory 'data' svn: Can't move source to dest svn: Can't move 'data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' to 'data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': No such file or directory That is with Mac OS X svn command-line client, version 1.4.4 I can get bioperl-live/tags/release-0-9-2/t/data to check out fine with a linux svn command-line client, version 1.2.3. Cheers, Sendu. From dmessina at wustl.edu Thu Jun 28 15:08:59 2007 From: dmessina at wustl.edu (David Messina) Date: Thu, 28 Jun 2007 10:08:59 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <18051.44281.831316.749586@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> Message-ID: > [George] > Likewise, you probably DON'T want to use this in your config file: > > enable-auto-props = yes > * = svn:keywords="Author Date Id Rev URL" > > since it'll do the same thing. Ah, so I've been doing it wrong all along then. :) Thanks, George! > The Right Thing To Do is a more tedious > > *.pl = svn:keywords="Author Date Id Rev URL" > *.pm = svn:keywords="Author Date Id Rev URL" > *.c = svn:keywords="Author Date Id Rev URL" > > A bit of googling will give you a good starting point for the list, > and we should probably maintain a common one somewhere in the repo. 
I've googled around and gathered the following as a possible list for our repo. Since I obviously don't know what I'm doing :), of course adjust and refine as necessary.

Dave

-------

[auto-props]

# Code formats
*.c = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.cpp = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.h = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.java = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.as = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.cgi = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.js = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/javascript
*.php = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-php
*.pl = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-perl; svn:executable
*.pm = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-perl
*.py = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-python; svn:executable
*.sh = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-sh; svn:executable

# Image formats
*.bmp = svn:mime-type=image/bmp
*.gif = svn:mime-type=image/gif
*.ico = svn:mime-type=image/ico
*.jpeg = svn:mime-type=image/jpeg
*.jpg = svn:mime-type=image/jpeg
*.png = svn:mime-type=image/png
*.tif = svn:mime-type=image/tiff
*.tiff = svn:mime-type=image/tiff

# Data formats
*.pdf = svn:mime-type=application/pdf
*.avi = svn:mime-type=video/avi
*.doc = svn:mime-type=application/msword
*.eps = svn:mime-type=application/postscript
*.gz = svn:mime-type=application/gzip
*.mov = svn:mime-type=video/quicktime
*.mp3 = svn:mime-type=audio/mpeg
*.ppt = svn:mime-type=application/vnd.ms-powerpoint
*.ps = svn:mime-type=application/postscript
*.psd = svn:mime-type=application/photoshop
*.rtf = svn:mime-type=text/rtf
*.swf = svn:mime-type=application/x-shockwave-flash
*.tgz = svn:mime-type=application/gzip
*.wav = svn:mime-type=audio/wav
*.xls = svn:mime-type=application/vnd.ms-excel
*.zip = svn:mime-type=application/zip

# Text formats
.htaccess = svn:mime-type=text/plain
*.css = svn:mime-type=text/css
*.dtd = svn:mime-type=text/xml
*.html = svn:mime-type=text/html
*.ini = svn:mime-type=text/plain
*.sql = svn:mime-type=text/x-sql
*.txt = svn:mime-type=text/plain
*.xhtml = svn:mime-type=text/xhtml+xml
*.xml = svn:mime-type=text/xml
*.xsd = svn:mime-type=text/xml
*.xsl = svn:mime-type=text/xml
*.xslt = svn:mime-type=text/xml
*.xul = svn:mime-type=text/xul
*.yml = svn:mime-type=text/plain
CHANGES = svn:mime-type=text/plain
COPYING = svn:mime-type=text/plain
INSTALL = svn:mime-type=text/plain
Makefile* = svn:mime-type=text/plain
README = svn:mime-type=text/plain
TODO = svn:mime-type=text/plain

From dmessina at wustl.edu Thu Jun 28 15:11:23 2007
From: dmessina at wustl.edu (David Messina)
Date: Thu, 28 Jun 2007 10:11:23 -0500
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683B593.3050108@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.46242.942184.758493@almost.alerce.com> <4683B593.3050108@sendu.me.uk>
Message-ID:

> [Sendu]
>
> Yes. If you don't want everything, how does one 'browse' the
> repository
> to find out the address of the thing you /do/ want?
  svn ls file:///home/hartzell/bioperl

(that form only works while logged in to dev.open-bio.org), or

  svn ls svn+ssh://dev.open-bio.org/home/hartzell/bioperl

From n.haigh at sheffield.ac.uk Thu Jun 28 15:13:58 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 28 Jun 2007 16:13:58 +0100
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683B593.3050108@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.46242.942184.758493@almost.alerce.com> <4683B593.3050108@sendu.me.uk>
Message-ID: <4683D036.5060109@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sendu Bala wrote:
> George Hartzell wrote:
>> Sendu Bala writes:
>>> I just tried:
>>>
>>> svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl
> [snip]
>> It looks like you tried to check out the *entire* repository.
>
> Yes. If you don't want everything, how does one 'browse' the repository
> to find out the address of the thing you /do/ want?
>

You could try:

  svn ls URL

or

  svn ls -R URL

(where URL is the repository address) to get a list of directories.

>
>> It never occurred to me to try that. I'll take a look at what you
>> reported.
>
> Cheers.
> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGg9A2czuW2jkwy2gRAgirAKCnMAg6a7W7RM22O2rOi4vD5w3HPwCePsku akLhIszoQbRc/aVX3d/Jp7w= =mlHY -----END PGP SIGNATURE----- From cjfields at uiuc.edu Thu Jun 28 15:20:46 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 28 Jun 2007 10:20:46 -0500 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <4683C385.3050904@sendu.me.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> <4683C385.3050904@sendu.me.uk> Message-ID: <6830CBB0-70A1-4F84-9C52-F61AEDC4B83B@uiuc.edu> I can replicate the same problem (Mac OS X) with a full checkout: svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data' svn: Can't move source to dest svn: Can't move 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/ tmp/prop-base/HUMBETGLOA.FASTA.svn-base' to 'bioperl/bioperl-live/ tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': No such file or directory What local (mac) svn version are you using? I'm running off macports: svn --version svn, version 1.4.4 (r25188) compiled Jun 16 2007, 23:40:53 chris On Jun 28, 2007, at 9:19 AM, Sendu Bala wrote: ... > I tried again in the same location and it told me I had to 'svn > cleanup', which I did. But subsequently it kept complaining about > files > already being there. >> > [snip] >> Can another mac os x user out there give the Great Big Checkout a try >> and see if it runs to completion. 
Potential problems that come to >> mind are: >> >> - the "mac's are case insensitive, sort of" problem >> - you filled up your disk >> - something else. > > Well, I didn't run out of disc space. After a rm -fr * and trying > again > it failed at exactly the same point, in the same way. > > svn co > svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/ > release-0-9-2/t/data > > causes this repeatable problem: > > [...] > A data/phredfile.phd > svn: In directory 'data' > svn: Can't move source to dest > svn: Can't move 'data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' to > 'data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': No such file or > directory > > That is with Mac OS X svn command-line client, version 1.4.4 > > I can get bioperl-live/tags/release-0-9-2/t/data to check out fine > with > a linux svn command-line client, version 1.2.3. > > > Cheers, > Sendu. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Thu Jun 28 15:37:27 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 28 Jun 2007 10:37:27 -0500 Subject: [Bioperl-l] Splits again In-Reply-To: <4683624F.6020402@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: On Jun 28, 2007, at 2:25 AM, Sendu Bala wrote: > Chris Fields wrote: >> ... > > The short and sweet version: my proposal has all the benefits of > yours, but none of the disadvantages. What's not to like? 
The short and sweet version: I'm more convinced after you laid out your argument in detail, which would have saved me some typing last night, BTW, thanks! ; > The other core devs need to chip in and we need to openly (candidly) discuss it some more (I've added Hilmar to this). There is also a tenable solution that allows both aspects ('cliques' and single mode) which might make everybody happy. Let's say we only want to install Bio::SeqIO::genbank. The Bio::SeqIO::genbank Build.PL would only install what was needed (as you indicated), only Bio::SeqIO::genbank-related tests would run (along with dependency test, if available), and life would go on. However, what if we wanted to install everything in SeqIO/DB/AlignIO/ etc? We could have the Bio::SeqIO Build.PL ask whether you want all SeqIO modules installed or a select few (maybe a quick 'install all (y/n)?' followed by a list, which installs them one at a time along with dependencies), or have the option to specifically denote them as passed args to SeqIO's Build.PL, something like 'perl Build.PL - install-plugins genbank embl swiss', 'perl Build.PL -install-plugins all', etc. If a specific module (Bio::SeqIO::genbank) is installed directly then maybe the installation q&a's of followed modules could be bypassed when installing down the dependency tree with additional passed args. This would, in effect, be a bioperl-specific mini-CPAN within CPAN. Nice! Now, this doesn't address several related issues, such as how we handle versioning of the independent modules (should be in a controlled manner), what we do about deprecated modules which linger about on CPAN, how we deal with PPMs/RPMs/packaging, and so on. All have possible reasonable ways they can be addressed, I believe. Also, I think we should still think about doing regular full-scale 'stable' (1.#) releases (sort of our stamp of approval for that batch of modules at that point in time, with a reasonable 'sell-by' date). 
Again, it should be seriously discussed among the core devs and the bioperl community at large prior to any serious work on it, and it would be quite a large-scale project, but possibly worth it. It can only go forward if there is enough momentum behind it. >> Finally, all of this should wait until later. Much later, like >> after a decent release, after svn, etc kind of 'later'. I think >> we can agree on that. > > Hmm, not really. If it can be implemented by a change in just > Build.PL and ModuleBuildBioperl, its really independent of > everything else. That's the beauty of it: the only thing that > changes is how things are uploaded to and downloaded from CPAN. The > only person that normally deals with that issue is the pumpkin for > a release, and he only cares about it at release time. > > In fact, if we're going to do it at all it makes sense to try it > out on a minor release like 1.5.3. We've already got experience of > doing it split-style from 1.5.2. (And let me tell you: splits at > the code-base level suck.) BOSC is coming up, and I would like to focus on getting svn migration taken care of ASAP (which is sounding more and more like we plan on moving all open-bio over, unless I misread Jason's post?) and stomping of bugs (my next priority after EUtilities). Maybe in the interim we should try focusing on bug squashing, get out a quick standard dev release (1.5.3) before BOSC, and then a few of us could all communicate there via email/text/IM/phone off-list? Maybe post updates via the bioperl blog and list? > And where is the harm in letting them do it via CPAN as well? In > fact, there are significant benefits: ... I'm already pretty convinced... > The same can be achieved with CPAN bundles for each kind of > functional grouping you can think of. 
And since its just a single > text file that defines such a grouping, its easy to change or add > new ones as you feel like it, as opposed to the rather more > permanent and substantial effort of creating one of your splits on > the code-base level. ... or it could be run right in Module::Build for specific parent classes (as I mention above). Bundling could be instituted for something like a standard GBrowse release (Bundle::BioPerl::GBrowse) where the functionality might be more spread out (Bio::DB*, Bio::Graphics, Bio::FeatureIO, etc). For a full-scale old-style core install, another Bundle (Bundle::BioPerl::Standard). ... > Yes, it would be automated, and no, it wouldn't at all be any kind > of additional headache. I'm proposing a fully-automated system that > the pumpkin wouldn't even have to think about it. Much /less/ of a > headache than dealing with splits. Orders of magnitude easier to > deal with. The 'headache' would be the initial setup (splitting test, individual Build.PL, etc), but this could be done stepwise or section-wise, I suppose. ... > And the smallest, most concentrated set of modules is the > individual module. Well, only if it runs correctly (i.e. has the entire dep. tree installed). But the 'follow' tests would handle that. > The reason some of these existing splits (micoarray, ext) have > fallen by the way-side? /Because/ they're splits. If they had been > part of bioperl-live all along, they'd have been kept in a working, > compatible state and would have been released along with everything > else in 1.5.2 microarray fell out of favor for other reasons (much faster ways to do the same thing via R), though I think it still could be salvaged if someone wanted to take it up. the other bioperl distros (network, db, run, etc) would also necessitate following the same path as core, but I guess they could be bundled as well. > ... > No headaches. I already have one, sorry! 
chris From n.haigh at sheffield.ac.uk Thu Jun 28 15:53:52 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Thu, 28 Jun 2007 16:53:52 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: <4683D990.8090909@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Chris Fields wrote: > On Jun 28, 2007, at 2:25 AM, Sendu Bala wrote: > >> Chris Fields wrote: >>> ... >> >> The short and sweet version: my proposal has all the benefits of >> yours, but none of the disadvantages. What's not to like? > > The short and sweet version: I'm more convinced after you laid out your > argument in detail, which would have saved me some typing last night, > BTW, thanks! ; > > > The other core devs need to chip in and we need to openly (candidly) > discuss it some more (I've added Hilmar to this). There is also a > tenable solution that allows both aspects ('cliques' and single mode) > which might make everybody happy. Couldn't "cliques" simply be satisfied with CPAN Bundles? > > Let's say we only want to install Bio::SeqIO::genbank. The > Bio::SeqIO::genbank Build.PL would only install what was needed (as you > indicated), only Bio::SeqIO::genbank-related tests would run (along with > dependency test, if available), and life would go on. However, what if > we wanted to install everything in SeqIO/DB/AlignIO/etc? I think this might be where Bundles come in for installing these "cliques" of related modules? - -- snip -- > >> Yes, it would be automated, and no, it wouldn't at all be any kind of >> additional headache. I'm proposing a fully-automated system that the >> pumpkin wouldn't even have to think about it. 
Much /less/ of a >> headache than dealing with splits. Orders of magnitude easier to deal >> with. > > The 'headache' would be the initial setup (splitting test, individual > Build.PL, etc), but this could be done stepwise or section-wise, I suppose. Yes, I think this is where most of the labour will be. However, setting the test suite up like this would be beneficial with or without publishing modules individually. - -- snip -- -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGg9mQczuW2jkwy2gRAlfBAKCFP7XUvWXsjycSv0MVGN3Ru40D/wCcDiDg UKE/Q/wA3gu1Gb7S6rarCQw= =WQdY -----END PGP SIGNATURE----- From bix at sendu.me.uk Thu Jun 28 16:03:54 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 17:03:54 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: <4683DBEA.90005@sendu.me.uk> Chris Fields wrote: > On Jun 28, 2007, at 2:25 AM, Sendu Bala wrote: > Let's say we only want to install Bio::SeqIO::genbank. The > Bio::SeqIO::genbank Build.PL would only install what was needed (as you > indicated), only Bio::SeqIO::genbank-related tests would run (along with > dependency test, if available), and life would go on. However, what if > we wanted to install everything in SeqIO/DB/AlignIO/etc? > > We could have the Bio::SeqIO Build.PL ask whether you want all SeqIO > modules installed or a select few (maybe a quick 'install all (y/n)?' 
> followed by a list, which installs them one at a time along with > dependencies), or have the option to specifically denote them as passed > args to SeqIO's Build.PL, something like 'perl Build.PL -install-plugins > genbank embl swiss', 'perl Build.PL -install-plugins all', etc. If a > specific module (Bio::SeqIO::genbank) is installed directly then maybe > the installation q&a's of followed modules could be bypassed when > installing down the dependency tree with additional passed args. I'd probably stay away from something like this. My primary reason being, off-the-top-of-my-head I don't see how to get it to work. If you're installing Bio::SeqIO for the first time via CPAN you can't ask it to install Bio::SeqIO::genbank et al. at the same time because Bio::SeqIO::genbank depends on Bio::SeqIO, giving us some circularity. I also wouldn't want these things to be complicated. There should be little in the way of questions to ask during install. Each module's Build.PL should be ultra-simple with no advanced logic at all. It should just specify things that are absolute requirements. This simplicity helps avoid some of the problems we face by distributing the monolithic Bioperl. No, much better for us and for users to provide a Bundle::Bio-SeqIO. > Now, this doesn't address several related issues, such as how we handle > versioning of the independent modules (should be in a controlled > manner), When a module is changed, it gets a version bump. Nothing complicated needs to be done. Transparent and obvious, behaving like all other CPAN modules would be my choice. > what we do about deprecated modules which linger about on CPAN, Delete them from CPAN seems appropriate. > how we deal with PPMs/RPMs/packaging, and so on. All have possible > reasonable ways they can be addressed, I believe. 
Also, I think we > should still think about doing regular full-scale 'stable' (1.#) > releases (sort of our stamp of approval for that batch of modules at > that point in time, with a reasonable 'sell-by' date). Yes, we can still choose to take a snapshot and announce it to the world, but at the module-level nothing special would happen. There would just be an updated Bundle::Bioperl-everything (or whatever). > Again, it should be seriously discussed among the core devs and the > bioperl community at large prior to any serious work on it, and it would > be quite a large-scale project, but possibly worth it. It can only go > forward if there is enough momentum behind it. The requirement for this approach is per-module test scripts. Which as I identified already, is very desirable anyway so we can hit 100% test coverage. So, regardless of anything else can we all agree that per-module test scripts are a good idea and should be worked on? If so, I'll look into the feasibility and figure out how much work will be involved. From cjfields at uiuc.edu Thu Jun 28 17:17:50 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 28 Jun 2007 12:17:50 -0500 Subject: [Bioperl-l] Splits again In-Reply-To: <4683DBEA.90005@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> <4683DBEA.90005@sendu.me.uk> Message-ID: <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu> On Jun 28, 2007, at 11:03 AM, Sendu Bala wrote: > ... > I'd probably stay away from something like this. My primary reason > being, off-the-top-of-my-head I don't see how to get it to work. If > you're installing Bio::SeqIO for the first time via CPAN you can't > ask it to install Bio::SeqIO::genbank et al. 
at the same time > because Bio::SeqIO::genbank depends on Bio::SeqIO, giving us some > circularity. True... > I also wouldn't want these things to be complicated. There should > be little in the way of questions to ask during install. Each > module's Build.PL should be ultra-simple with no advanced logic at > all. It should just specify things that are absolute requirements. > This simplicity helps avoid some of the problems we face by > distributing the monolithic Bioperl. > > No, much better for us and for users to provide a Bundle::Bio-SeqIO. I just don't want too much Bundle-itis as it'll get confusing for newbies (e.g. Vista-itis, or AdobeCS-itis). It should be limited to functional groupings (SeqIO, AlignIO, DB, etc), 'install everything', or distribution-specific (GBrowse). I also think (though Hilmar may veto this) that we should work on integrating bioperl-db, network, etc. into this if it goes forward. Here's a question: how do we plan on handling uploading bioperl updates to CPAN via PAUSE? Do we want to run every single module through one pumpkin? Or do we want to have a core dev group PAUSE account? I can see, for instance, removing everything EUtilities-related and submitting it independently using my own PAUSE account, but it would be nice to have it under an umbrella 'bioperl-devs' account instead. >> Now, this doesn't address several related issues, such as how we >> handle versioning of the independent modules (should be in a >> controlled manner), > > When a module is changed, it gets a version bump. Nothing > complicated needs to be done. Transparent and obvious, behaving > like all other CPAN modules would be my choice. > >> what we do about deprecated modules which linger about on CPAN, > > Delete them from CPAN seems appropriate. I know you can do that via PAUSE, but I think it lingers about on search.cpan.org (unless that's been fixed). This would prob. have to be used sparingly. >> how we deal with PPMs/RPMs/packaging, and so on.
All have >> possible reasonable ways they can be addressed, I believe. Also, >> I think we should still think about doing regular full-scale >> 'stable' (1.#) releases (sort of our stamp of approval for that >> batch of modules at that point in time, with a reasonable >> 'sell-by' date). > > Yes, we can still choose to take a snapshot and announce it to the > world, but at the module-level nothing special would happen. There > would just be an updated Bundle::Bioperl-everything (or whatever). Right, it would basically be a stamp of certification. >> Again, it should be seriously discussed among the core devs and >> the bioperl community at large prior to any serious work on it, >> and it would be quite a large-scale project, but possibly worth >> it. It can only go forward if there is enough momentum behind it. > > The requirement for this approach is per-module test scripts. Which > as I identified already, is very desirable anyway so we can hit > 100% test coverage. > > So, regardless of anything else can we all agree that per-module > test scripts are a good idea and should be worked on? If so, I'll > look into the feasibility and figure out how much work will be > involved. I think so, but the feasibility issue is critical. Do we want cvs/svn to be divided up into 900 subdirectories (one for each module), or do we want to have a similar directory structure as we have now, but with each module in its own directory? Or leave everything as is and generate Build.PL on-the-fly (prob. least feasible)? This is where it might be wise to do it piece-meal at first (maybe starting with something somewhat segregated like Bio::Tools), then progress from there.
chris From hartzell at alerce.com Thu Jun 28 17:38:48 2007 From: hartzell at alerce.com (George Hartzell) Date: Thu, 28 Jun 2007 13:38:48 -0400 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> Message-ID: <18051.61992.627473.323346@almost.alerce.com> David Messina writes: > > [George] > > Likewise, you probably DON'T want to use this in your config file: > > > > enable-auto-props = yes > > * = svn:keywords="Author Date Id Rev URL" > > > > since it'll do the same thing. > > Ah, so I've been doing it wrong all along then. :) Thanks, George! It's not *wrong* if it's never done anything to you that you've regretted. The right answer depends on your situation.... > [...] > I've googled around and gathered the following as a possible list for > our repo. Since I obviously don't know what I'm doing :), of course > adjust and refine as necessary. > That's a great starting point. Do you have write access to the wiki? Could you link it off of the instructions for using svn? g. 
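[Editorial sketch] For readers following the auto-props exchange above, here is a rough illustration of a narrower client-side setup. It assumes the usual per-user Subversion config file (~/.subversion/config); the file patterns are hypothetical and are not the list David gathered for the wiki. The idea is only to name text file types explicitly instead of using a catch-all `*` pattern, so that binary files never pick up svn:keywords or svn:eol-style.

```ini
[miscellany]
enable-auto-props = yes

[auto-props]
# Text files: keyword expansion plus native line endings.
# (Illustrative patterns; adjust to the file types in your tree.)
*.pm = svn:eol-style=native;svn:keywords=Author Date Id Rev URL
*.pl = svn:eol-style=native;svn:keywords=Author Date Id Rev URL
*.t = svn:eol-style=native;svn:keywords=Author Date Id Rev URL
# Binary files: record a MIME type and set nothing else.
*.png = svn:mime-type=image/png
```

Auto-props are applied by the client when files are added or imported, so files already in the repository would still need their properties set by hand.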
From hartzell at alerce.com Thu Jun 28 18:06:50 2007 From: hartzell at alerce.com (George Hartzell) Date: Thu, 28 Jun 2007 14:06:50 -0400 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <4683C385.3050904@sendu.me.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> <4683C385.3050904@sendu.me.uk> Message-ID: <18051.63674.685297.426813@almost.alerce.com> Sendu Bala writes: > [...] > I tried again in the same location and it told me I had to 'svn > cleanup', which I did. But subsequently it kept complaining about files > already being there. You needed to do the cleanup because svn exited gracelessly and you had to help it get back on its feet. The cleanup doesn't remove the stuff that you did get checked out, so it's still there getting in the way of your new checkout. > [...] > svn co > svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data > > causes this repeatable problem: > > [...] > A data/phredfile.phd > svn: In directory 'data' > svn: Can't move source to dest > svn: Can't move 'data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' to > 'data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': No such file or directory > > That is with Mac OS X svn command-line client, version 1.4.4 > > I can get bioperl-live/tags/release-0-9-2/t/data to check out fine with > a linux svn command-line client, version 1.2.3. I'm not 100% sure what's going on here, but I'm inclined to say "get a real computer" (and yes, I'm typing this on a mac...). I have a mac pro that runs FreeBSD-STABLE part time and it's ggrrreeaatt (as tony the tiger used to say).... I think that we're having trouble with case sensitivity.
My only evidence is that I can see where there have been both HUMBETGLOA.FASTA and HUMBETGLOA.fasta in the tree at various times. I can't figure out anything else that's weird about that file. On the other hand, I can't see how this would cause the error you're seeing though. The experiment would be to grab a usb or firewire disk (or even a memory stick), partition/format it as case sensitive (or even *unix*) and try to do svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data into it. If it works, voila. If not, I'll keep making stuff up, err, thinking about it. g. From dmessina at wustl.edu Thu Jun 28 18:15:32 2007 From: dmessina at wustl.edu (David Messina) Date: Thu, 28 Jun 2007 13:15:32 -0500 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <6830CBB0-70A1-4F84-9C52-F61AEDC4B83B@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> <4683C385.3050904@sendu.me.uk> <6830CBB0-70A1-4F84-9C52-F61AEDC4B83B@uiuc.edu> Message-ID: <459D9BC0-4FBA-4560-80A8-E6243DE9D9CC@wustl.edu> Same svn error here on the full checkout. > What local (mac) svn version are you using? I'm running off macports: > > svn --version > svn, version 1.4.4 (r25188) > compiled Jun 16 2007, 23:40:53 I have svn 1.4.3. % svn --version svn, version 1.4.3 (r23084) compiled Apr 1 2007, 02:47:14 Copyright (C) 2000-2006 CollabNet. Subversion is open source software, see http://subversion.tigris.org/ This product includes software developed by CollabNet (http:// www.Collab.Net/). The following repository access (RA) modules are available: * ra_dav : Module for accessing a repository via WebDAV (DeltaV) protocol. 
- handles 'http' scheme * ra_svn : Module for accessing a repository using the svn network protocol. - handles 'svn' scheme * ra_local : Module for accessing a repository on local disk. - handles 'file' scheme From cjfields at uiuc.edu Thu Jun 28 18:54:15 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 28 Jun 2007 13:54:15 -0500 Subject: [Bioperl-l] First cut svn repository In-Reply-To: <18051.63674.685297.426813@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> <4683C385.3050904@sendu.me.uk> <18051.63674.685297.426813@almost.alerce.com> Message-ID: On Jun 28, 2007, at 1:06 PM, George Hartzell wrote: > ... > I'm not 100% sure what's going on here, but I'm inclined to say "get a > real computer" (and yes, I'm typing this on a mac...). I have a mac > pro that runs FreeBSD-STABLE part time and it's ggrrreeaatt (as tony > the tiger used to say).... Ouch! Though it could be worse (**coughwindowscough**). > I think that we're having trouble with case sensitivity. My only > evidence is that I can see where there have been both HUMBETGLOA.FASTA > and HUMBETGLOA.fasta in the tree at various times. I can't figure out > anything else that's weird about that file. On the other hand, I > can't see how this would cause the error you're seeing though. Odd that other branches (including the main trunk) work but that one doesn't. > The experiment would be to grab a usb or firewire disk (or even a > memory stick), partition/format it as case sensitive (or even *unix*) > and try to do > > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- > live/tags/release-0-9-2/t/data > > into it. If it works, voila. 
If not, I'll keep making stuff up, err, > thinking about it. > > g. I'll have to figure out why I can't get ssh keys to work locally to test it out more (I have a usb drive to test with); just don't have time at the moment. chris From dmessina at wustl.edu Thu Jun 28 18:47:04 2007 From: dmessina at wustl.edu (David Messina) Date: Thu, 28 Jun 2007 13:47:04 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <18051.61992.627473.323346@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> <18051.61992.627473.323346@almost.alerce.com> Message-ID: <0027C4E0-26B1-41F3-8FD8-EAB5465CA80E@wustl.edu> > That's a great starting point. Do you have write access to the wiki? > Could you link it off of the instructions for using svn? Done. 
http://www.bioperl.org/wiki/Svn_auto-props linked from: http://www.bioperl.org/wiki/Using_Subversion (bottom of page) From bix at sendu.me.uk Thu Jun 28 19:19:35 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 20:19:35 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> <4683DBEA.90005@sendu.me.uk> <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu> Message-ID: <468409C7.7020102@sendu.me.uk> Chris Fields wrote: > On Jun 28, 2007, at 11:03 AM, Sendu Bala wrote: > Here's a question: how do we plan on handling uploading bioperl > updates to CPAN via PAUSE? Do we want to run every single module > through one pumpkin? Or do we want to have a core dev group PAUSE > account? I can see, for instance, removing everything > EUtilities-related and submitting it independently using my own PAUSE account, > but it would be nice to have it under an umbrella 'bioperl-devs' > account instead. All Bioperl modules (except the Bundle!) are owned by BIOPERLML on PAUSE. It's a little awkward since PAUSE is uploader-centric, but see my notes at http://www.bioperl.org/wiki/Making_a_BioPerl_release And certainly, everything that wants to consider itself part of Bioperl (and gain the benefit of lots of devs looking after it) should have BIOPERLML as the primary owner. > I think so, but the feasibility issue is critical. Do we want > cvs/svn to be divided up into 900 subdirectories (one for each module), > or do we want to have a similar directory structure as we have now, > but with each module in its own directory?
Or leave everything as > is and generate Build.PL on-the-fly (prob. least feasible)? Very definitely the latter. The key benefit of my approach is that the organisation stays as is and that a snapshot of the repository remains a single directory of modules in Bio so that people don't have to 'install' Bioperl, they can still just uncompress the archive (or check out the package from svn) and point their PERL5LIB to the root dir of the package. For that reason I very much like the idea of folding the current split-out packages (run, network etc.) back into the core package so everything is in one place. Folding them back in should obviously wait until everything is in place and working with core already. My proposal obviously wasn't very clear. As far as all other devs are concerned, nothing changes at all (except for lots of new improved test scripts). The pumpkin will, however, be able to say: ./Build dist Right now that generates the distribution archives (in different compression formats) - one big archive containing everything. My proposal is simply that instead it generates lots of archives, one archive per module. It will also generate some Bundles and whatever else might be needed. I don't envisage any major difficulties in achieving this. The 'feasibility' issue I was going to look into was strictly regarding doing all the new test scripts.
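[Editorial sketch] The "point your PERL5LIB at the root dir" approach described above might look roughly like this in a Bourne-style shell. The checkout location is a made-up example, not a path from the thread; the only assumption taken from the message is that the package root is the directory containing Bio/.

```shell
# Hypothetical location of an unpacked archive or an svn/cvs checkout.
BIOPERL_DIR="$HOME/src/bioperl-live"

# Prepend the package root (the directory that contains Bio/) to PERL5LIB,
# keeping whatever was already there.
PERL5LIB="$BIOPERL_DIR${PERL5LIB:+:$PERL5LIB}"
export PERL5LIB

# If the tree is really there, confirm perl picks the module up from it
# without any install step.
if [ -f "$BIOPERL_DIR/Bio/SeqIO.pm" ]; then
    perl -MBio::SeqIO -e 'print $INC{"Bio/SeqIO.pm"}, "\n"'
else
    echo "No Bio/SeqIO.pm under $BIOPERL_DIR; adjust BIOPERL_DIR first."
fi
```

Putting the export in a shell startup file makes the setting persistent; anything with external dependencies still needs those installed separately, as noted later in the thread.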
From hartzell at alerce.com Thu Jun 28 19:43:38 2007 From: hartzell at alerce.com (George Hartzell) Date: Thu, 28 Jun 2007 15:43:38 -0400 Subject: [Bioperl-l] First cut svn repository In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> <4683C385.3050904@sendu.me.uk> <18051.63674.685297.426813@almost.alerce.com> Message-ID: <18052.3946.224905.415905@almost.alerce.com> Chris Fields writes: > > On Jun 28, 2007, at 1:06 PM, George Hartzell wrote: > > > ... > > I'm not 100% sure what's going on here, but I'm inclined to say "get a > > real computer" (and yes, I'm typing this on a mac...). I have a mac > > pro that runs FreeBSD-STABLE part time and it's ggrrreeaatt (as tony > > the tiger used to say).... > > Ouch! Though it could be worse (**coughwindowscough**). > > > I think that we're having trouble with case sensitivity. My only > > evidence is that I can see where there have been both HUMBETGLOA.FASTA > > and HUMBETGLOA.fasta in the tree at various times. I can't figure out > > anything else that's weird about that file. On the other hand, I > > can't see how this would cause the error you're seeing though. > > Odd that other branches (including the main trunk) work but that one > doesn't. > > > The experiment would be to grab a usb or firewire disk (or even a > > memory stick), partition/format it as case sensitive (or even *unix*) > > and try to do > > > > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- > > live/tags/release-0-9-2/t/data > > > > into it. If it works, voila. If not, I'll keep making stuff up, err, > > thinking about it. > > > > g. 
> > I'll have to figure out why I can't get ssh keys to work locally to > test it out more (I have a usb drive to test with); just don't have > time at the moment. I just did the experiment, and filename-insensitivity seems to be breaking something. I'm using an svn I picked up from http://www.codingmonkeys.de/mbo/. I reformatted a memory stick to be case sensitive and co of bioperl/bioperl-live/tags/release-0-9-2/t worked, then I made a directory in my home dir (normal mac thing) and got the same error as above. I can get a copy of the trunk, so I'm inclined to ask someone to mention the problem on the wiki and then just ignore it. g. From cjfields at uiuc.edu Thu Jun 28 20:29:09 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 28 Jun 2007 15:29:09 -0500 Subject: [Bioperl-l] Splits again In-Reply-To: <468409C7.7020102@sendu.me.uk> References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> <4683DBEA.90005@sendu.me.uk> <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu> <468409C7.7020102@sendu.me.uk> Message-ID: <026156F4-4C46-4CC6-82B5-07FC5326A244@uiuc.edu> On Jun 28, 2007, at 2:19 PM, Sendu Bala wrote: > Chris Fields wrote: >> On Jun 28, 2007, at 11:03 AM, Sendu Bala wrote: >> Here's a question: how do we plan on handling uploading bioperl >> updates to CPAN via PAUSE? Do we want to run every single module >> through one pumpkin? Or do we want to have a core dev group PAUSE >> account? I can see, for instance, removing everything EUtilities- >> related and submitting it independently using my own PAUSE account, >> but it would be nice to have it under an umbrella 'bioperl-devs' >> account instead. > > All Bioperl modules (except the Bundle!) are owned by BIOPERLML on > PAUSE. 
It's a little awkward since PAUSE is uploader-centric, but see my > notes at http://www.bioperl.org/wiki/Making_a_BioPerl_release > > And certainly, everything that wants to consider itself part of Bioperl > (and gain the benefit of lots of devs looking after it) should > have BIOPERLML as the primary owner. Alrighty then. >> I think so, but the feasibility issue is critical. Do we want >> cvs/svn to be divided up into 900 subdirectories (one for each module), >> or do we want to have a similar directory structure as we have now, >> but with each module in its own directory? Or leave everything as >> is and generate Build.PL on-the-fly (prob. least feasible)? > > Very definitely the latter. The key benefit of my approach is that the > organisation stays as is and that a snapshot of the repository remains a > single directory of modules in Bio so that people don't have to > 'install' Bioperl, they can still just uncompress the archive (or check > out the package from svn) and point their PERL5LIB to the root dir of > the package. Okay, makes sense. > For that reason I very much like the idea of folding the current > split-out packages (run, network etc.) back into the core package so > everything is in one place. Folding them back in should obviously wait > until everything is in place and working with core already. I agree, but that's up to Brian, Hilmar, and the others who donated the packages (or at least a consensus of core devs). One thing at a time. > My proposal obviously wasn't very clear. As far as all other devs are > concerned, nothing changes at all (except for lots of new improved test > scripts). The pumpkin will, however, be able to say: > > ./Build dist > > Right now that generates the distribution archives (in different > compression formats) - one big archive containing everything. > My proposal is simply that instead it generates lots of archives, one > archive per module.
It will also generate some Bundles and whatever else > might be needed. We'll need to define which tests and data go with each module and so on. > I don't envisage any major difficulties in achieving this. The > 'feasibility' issue I was going to look into was strictly regarding > doing all the new test scripts. Okay. Maybe it's worth doing on a branch as a test run when 1.5.3 is ready to go. We'll still need to get thoughts on this from other core devs out there, and it prob. should wait until everybody is comfortable with the idea. chris From dmessina at wustl.edu Thu Jun 28 22:13:48 2007 From: dmessina at wustl.edu (David Messina) Date: Thu, 28 Jun 2007 17:13:48 -0500 Subject: [Bioperl-l] Splits again In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: Coming late to this party, I'm replying to snippets from multiple emails. > [Chris] > what we do about deprecated modules which linger > about on CPAN > [Sendu] > Delete them from CPAN seems appropriate. I coulda sworn this was frowned upon, but a recent thread suggests it's totally kosher. http://www.nntp.perl.org/group/perl.qa/2007/03/msg8473.html > [Sendu] > So, regardless of anything else can we all agree that per-module test > scripts are a good idea and should be worked on? I agree. > [Sendu] > people don't have to > 'install' Bioperl, they can still just uncompress the archive (or check > out the package from svn) and point their PERL5LIB to the root dir of > the package. Could you elaborate a bit on how this works? How is XS code that needs compiling handled? Or the scripts directory? I would love to be able to do this.
> [Sendu] > For that reason I very much like the idea of folding the current > split-out packages (run, network etc.) back into the core package so > everything is one place. Folding them back in should obviously wait > until everything is in place and working with core already. From an organizational standpoint, I'm concerned that with ~900 modules in core right now, adding all of the additional stuff from the split-out packages would make for a daunting directory. But as you said, this is way down the road, so this proposal doesn't bear on the other, closer-to-now issues on the table. > [Chris] > Okay. Maybe it's worth doing on a branch as a test run when 1.5.3 > is ready to go. We'll still need to get thoughts on this from other > core devs out there, and it prob. should until everybody is > comfortable with the idea. If we go forward with the CPAN split plan, I like the idea of having a trial. We can foresee some of the issues that such a change may bring, and yet still more no doubt wait for us once we do it. Dave From bix at sendu.me.uk Thu Jun 28 22:59:35 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 28 Jun 2007 23:59:35 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: <46843D57.2080409@sendu.me.uk> David Messina wrote: >> people don't have to 'install' Bioperl, they can still just >> uncompress the archive (or check out the package from svn) and >> point their PERL5LIB to the root dir of the package. > > Could you elaborate a bit on how this works? How is XS code that > needs compiling handled? Or the scripts directory? I would love to be > able to do this. I meant for the most part. 
Core doesn't have any XS code so that's not an issue. Scripts can be run manually like any other perl script. When you discover something isn't working because of a missing external dependency, you just install it. (But that happens very rarely.) Personally I've /never/ installed Bioperl and used that installed set of modules. I've always just pointed my PERL5LIB at the distribution folder or my cvs checkout. Which makes me a strange candidate for advocating all these CPAN-specific changes, but there you go ;) From cjfields at uiuc.edu Thu Jun 28 23:03:02 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 28 Jun 2007 18:03:02 -0500 Subject: [Bioperl-l] Splits again In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: <8B6FBB52-5CCE-4122-876C-B9827C86E46E@uiuc.edu> On Jun 28, 2007, at 5:13 PM, David Messina wrote: > Coming late to this party, I'm replying to snippets from multiple > emails. > > >> [Chris] >> what we do about deprecated modules which linger >> about on CPAN > >> [Sendu] >> Delete them from CPAN seems appropriate. > > I coulda sworn this was frowned upon, but a recent thread suggests > it's totally kosher. > > http://www.nntp.perl.org/group/perl.qa/2007/03/msg8473.html As long as it doesn't show up somewhere to confuse newbies I'm okay with it. >> [Sendu] >> people don't have to >> 'install' Bioperl, they can still just uncompress the archive (or >> check >> out the package from svn) and point their PERL5LIB to the root dir of >> the package. > > Could you elaborate a bit on how this works? How is XS code that > needs compiling handled? Or the scripts directory? I would love to > be able to do this. 
Maybe Sendu can add to this, but the XS code is limited to bioperl-ext AFAIK. We could keep that separate until it plays well with bioperl itself. Scripts and examples - maybe packaged along with a Bundle? >> [Sendu] >> For that reason I very much like the idea of folding the current >> split-out packages (run, network etc.) back into the core package so >> everything is one place. Folding them back in should obviously wait >> until everything is in place and working with core already. > > From an organizational standpoint, I'm concerned that with ~900 > modules in core right now, adding all of the additional stuff from > the split-out packages would make for a daunting directory. > > But as you said, this is way down the road, so this proposal > doesn't bear on the other, closer-to-now issues on the table. Well, the code in bioperl-db and network complements code in core, so I agree with Sendu they belong there. They should be under the same scrutiny as the rest anyway (code, tests, etc), but won't be bundled unless there is an 'install everything' Bundle. >> [Chris] >> Okay. Maybe it's worth doing on a branch as a test run when 1.5.3 >> is ready to go. We'll still need to get thoughts on this from other >> core devs out there, and it prob. should until everybody is >> comfortable with the idea. > > If we go forward with the CPAN split plan, I like the idea of > having a trial. We can foresee some of the issues that such a > change may bring, and yet still more no doubt wait for us once we > do it. That's what branches are for; testing stuff out like this. chris From hartzell at alerce.com Thu Jun 28 23:05:32 2007 From: hartzell at alerce.com (George Hartzell) Date: Thu, 28 Jun 2007 19:05:32 -0400 Subject: [Bioperl-l] problem with binary files. Message-ID: <18052.16060.932502.183552@almost.alerce.com> Ok, after pointing out the problem with setting the svn:keywords property on binary files, it turns out that I *did* that.
Worse yet, I set the svn:eol-style to 'native' on everything, including binary files, so depending on your platform they're likely to be fubar. For example, bioperl-run/t/data/H_pylori_J99.glimmer2.icm may or may not be what you expect it to be, depending on whether your eol-style matches the servers and whether any conversions were done. I'll touch up the way that the little tool I'm using calls cvs2svn and redo the repository. g. From n.haigh at sheffield.ac.uk Fri Jun 29 06:59:21 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 29 Jun 2007 07:59:21 +0100 Subject: [Bioperl-l] Splits again In-Reply-To: References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> Message-ID: <4684ADC9.8040404@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 - -- split -- >> [Sendu] >> For that reason I very much like the idea of folding the current >> split-out packages (run, network etc.) back into the core package so >> everything is one place. Folding them back in should obviously wait >> until everything is in place and working with core already. > > From an organizational standpoint, I'm concerned that with ~900 > modules in core right now, adding all of the additional stuff from > the split-out packages would make for a daunting directory. > > But as you said, this is way down the road, so this proposal doesn't > bear on the other, closer-to-now issues on the table. > I don't think this is an issue - it would simply mean everything is under the same version control hierarchy. And with svn it's Soooooo much easier to fiddle around with directory structures > > >> [Chris] >> Okay. Maybe it's worth doing on a branch as a test run when 1.5.3 >> is ready to go. 
We'll still need to get thoughts on this from other >> core devs out there, and it prob. should until everybody is >> comfortable with the idea. > > If we go forward with the CPAN split plan, I like the idea of having > a trial. We can foresee some of the issues that such a change may > bring, and yet still more no doubt wait for us once we do it. > Under svn it would be easy to make an "svn copy" of run, network etc into a branch of live to test this out. Not that this is likely to be a problem, but since we are looking at bioperl-* packages being under the same svn repository, "svn copy" operations are cheap in disk space. > > Dave > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGhK3JczuW2jkwy2gRAtI2AJ4kNrpGY8XMMh9KxOqs+l0PrEVcwgCfVFj6 BCvltmPyWF4ImueYmd7VFAc= =ktl+ -----END PGP SIGNATURE----- From n.haigh at sheffield.ac.uk Fri Jun 29 07:05:33 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 29 Jun 2007 08:05:33 +0100 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <18051.61992.627473.323346@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> <18051.61992.627473.323346@almost.alerce.com> Message-ID: <4684AF3D.5090907@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 George Hartzell wrote: - -- snip -- > > [...] > > I've googled around and gathered the following as a possible list for > > our repo.
Since I obviously don't know what I'm doing :), of course > > adjust and refine as necessary. > > > > That's a great starting point. Do you have write access to the wiki? > Could you link it off of the instructions for using svn? > > g. Don't .t files need adding to the auto-props? Nath -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGhK89czuW2jkwy2gRAnRGAJ0VnBNVBAdQdfUnqPhmvsyQnD/bswCggSHC /Iivb6Lc4/51bUdrTmRQYlE= =V+t2 -----END PGP SIGNATURE----- From sac at bioperl.org Fri Jun 29 08:25:36 2007 From: sac at bioperl.org (Steve Chervitz) Date: Fri, 29 Jun 2007 01:25:36 -0700 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> Message-ID: <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> On 6/27/07, Chris Fields wrote: > > On Jun 26, 2007, at 3:21 PM, George Hartzell wrote: > > > ... > > If you have a dev.open-bio.org account and you're in the bioperl > > group, you're good to get at it via: > > > > file:///home/hartzell/bioperl > > > > or > > > > svn+ssh://dev.open-bio.org/home/hartzell/bioperl > > I managed to get it working using file://. Haven't tried svn+ssh yet > but I've had persistent problems getting ssh to work properly on my > macbook; not sure why yet but I haven't had time to play around with it. Are you using the ssh that comes installed with OSX? If so, I'd recommend installing openssh from MacPorts. I recall having issues with the stock version which were resolved by using the more up-to-date version you can get via MacPorts. 
BTW, I haven't been able to check out the new svn repository via svn+ssh:// because I can't get svn to authenticate with an alternative username. My username on dev.open-bio.org differs from what it is on my local machine, so I issue a command such as: steve at localhost $ svn --username sac checkout svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk but I get challenged with: steve at dev.open-bio.org's password: I also tried putting the --username argument after the subcommand, but it still wants to use my local username. I can ssh -l sac into the dev box no problem. Any suggestions? Steve From bix at sendu.me.uk Fri Jun 29 08:52:42 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 29 Jun 2007 09:52:42 +0100 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> Message-ID: <4684C85A.5030206@sendu.me.uk> Steve Chervitz wrote: > BTW, I haven't been able to check out the new svn repository via > svn+ssh:// because I can't get svn to authenticate with an alternative > username. My username on dev.open-bio.org differs from what it is on > my local machine, so I issue a command such as: > > steve at localhost $ svn --username sac checkout > svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk > > but I get challenged with: > steve at dev.open-bio.org's password: > > I also tried putting the --username argument after the subcommand, but > it still wants to use my local username. I can ssh -l sac into the dev > box no problem. Any suggestions? 
Set up your ssh key on the dev machine. I'm also on a machine with the wrong username and it works even without attempting to supply the correct one. It does, however, show the 'Welcome to the new developer system' message 2 or 3 times for every svn+ssh action, which freaks me out a little. From N.Haigh at sheffield.ac.uk Fri Jun 29 09:32:38 2007 From: N.Haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 29 Jun 2007 10:32:38 +0100 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> Message-ID: <1183109558.4684d1b69bcec@webmail.shef.ac.uk> Quoting Steve Chervitz : -- snip -- > BTW, I haven't been able to check out the new svn repository via > svn+ssh:// because I can't get svn to authenticate with an alternative > username. My username on dev.open-bio.org differs from what it is on > my local machine, so I issue a command such as: > > steve at localhost $ svn --username sac checkout > svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk > > but I get challenged with: > steve at dev.open-bio.org's password: > > I also tried putting the --username argument after the subcommand, but > it still wants to use my local username. I can ssh -l sac into the dev > box no problem. Any suggestions? 
> > Steve > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > You could try: svn+ssh://USERNAME at dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk Nath From dmessina at wustl.edu Fri Jun 29 12:28:26 2007 From: dmessina at wustl.edu (David Messina) Date: Fri, 29 Jun 2007 07:28:26 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> Message-ID: <42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu> > > BTW, I haven't been able to check out the new svn repository via > svn+ssh:// because I can't get svn to authenticate with an alternative > username. I have the same issue. I set up a stanza in my ~/.ssh/config: Host dev.open-bio.org User dave_messina where dave_messina is my dev.open-bio.org username. 
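The two workarounds mentioned in this thread (a username embedded in the svn+ssh URL, or a Host/User stanza in ~/.ssh/config) can be sketched roughly as follows. This is only an illustration: the helper name svn_ssh_url is made up for this example, and "sac" is the remote account named in the thread, so substitute your own login.

```shell
# Workaround 1: carry the remote login in the URL itself, so ssh does not
# fall back to the local username.
svn_ssh_url() {
    # $1 = remote username, $2 = host, $3 = absolute repository path
    printf 'svn+ssh://%s@%s%s\n' "$1" "$2" "$3"
}

url=$(svn_ssh_url sac dev.open-bio.org /home/hartzell/bioperl/bioperl-live/trunk)
# Printed rather than executed, so the sketch never touches the network.
echo "svn checkout $url"

# Workaround 2: a ~/.ssh/config stanza, printed here for copy-and-paste.
printf 'Host %s\n    User %s\n' dev.open-bio.org sac
```

With the stanza in place, a plain svn+ssh://dev.open-bio.org/... URL should pick up the right account, since the login is chosen by ssh rather than by svn.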
From cjfields at uiuc.edu Fri Jun 29 17:00:27 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 29 Jun 2007 12:00:27 -0500 Subject: [Bioperl-l] ssh keys [was Re: First cut svn repository] In-Reply-To: <42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> <42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu> Message-ID: On Jun 29, 2007, at 7:28 AM, David Messina wrote: >> >> BTW, I haven't been able to check out the new svn repository via >> svn+ssh:// because I can't get svn to authenticate with an >> alternative >> username. > > I have the same issue. I set up a stanza in my ~/.ssh/config: > > Host dev.open-bio.org > User dave_messina > > where dave_messina is my dev.open-bio.org username. I changed to the macports ssh w/o luck. It appears the key is offered up, so maybe the problem is how I have everything set up on dev (though I followed everything on the wiki): .... Contact 'support at open-bio.org' for your new login information. ====================================== debug1: Authentications that can continue: publickey,gssapi-with-mic,password debug1: Next authentication method: publickey debug1: Offering public key: /Users/cjfields/.ssh/id_dsa debug2: we sent a publickey packet, wait for reply debug1: Authentications that can continue: publickey,gssapi-with-mic,password debug2: we did not send a packet, disable method debug1: Next authentication method: password It's odd; I can use passwordless logins for other servers (admittedly Mac servers) w/o problems using ssh keys, but dev.open-bio.org always prompts for a password regardless.
My feeling is it's something with my local ssh or sshd config; I'll try fiddling with it to see what happens. Anyone have suggestions? I've lost enough hair as is; don't want to lose more! chris From sac at bioperl.org Fri Jun 29 17:07:45 2007 From: sac at bioperl.org (Steve Chervitz) Date: Fri, 29 Jun 2007 10:07:45 -0700 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <1183109558.4684d1b69bcec@webmail.shef.ac.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> <1183109558.4684d1b69bcec@webmail.shef.ac.uk> Message-ID: <8f200b4c0706291007x2b765323n75c9003a47fe7cbb@mail.gmail.com> On 6/29/07, Nathan S. Haigh wrote: > Quoting Steve Chervitz : > > -- snip -- > > > BTW, I haven't been able to check out the new svn repository via > > svn+ssh:// because I can't get svn to authenticate with an alternative > > username. My username on dev.open-bio.org differs from what it is on > > my local machine, so I issue a command such as: > > > > steve at localhost $ svn --username sac checkout > > svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk > > > > but I get challenged with: > > steve at dev.open-bio.org's password: > > > > I also tried putting the --username argument after the subcommand, but > > it still wants to use my local username. I can ssh -l sac into the dev > > box no problem. Any suggestions? > > [...] > You could try: > svn+ssh://USERNAME at dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk Bingo. Thanks for the tips, guys. BTW, setting up ssh keys was not the issue, since my key is already set up on the dev machine. 
The svn --username setting appears to not be operative at the ssh layer. I suspected this might be the case given that the usage info says: $ svn --help co --username arg : specify a username ARG --password arg : specify a password ARG which seemed insecure. I didn't want to send my password in the clear, and didn't know if or whether svn would hand it off to ssh. It wasn't even sending my username to ssh, so I knew something was wrong. These args are probably only intended for accessing local svn repositories, or non-svn+ssh-based checkouts. BTW, the svn+ssh check out on Mac OS X works for me. I'm using svn and openssh installed via MacPorts: $ svn --version svn, version 1.4.4 (r25188) compiled Jun 28 2007, 23:51:53 $ ssh -version OpenSSH_4.6p1, OpenSSL 0.9.8e 23 Feb 2007 Steve From hartzell at alerce.com Fri Jun 29 19:19:31 2007 From: hartzell at alerce.com (George Hartzell) Date: Fri, 29 Jun 2007 15:19:31 -0400 Subject: [Bioperl-l] ssh keys [was Re: First cut svn repository] In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> <42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu> Message-ID: <18053.23363.102371.602742@almost.alerce.com> Chris Fields writes: > > On Jun 29, 2007, at 7:28 AM, David Messina wrote: > > >> > >> BTW, I haven't been able to check out the new svn repository via > >> svn+ssh:// because I can't get svn to authenticate with an > >> alternative > >> username. > > > > I have the same issue. I set up a stanza in my ~/.ssh/config: > > > > Host dev.open-bio.org > > User dave_messina > > > > where dave_messina is my dev.open-bio.org username. > > I changed to the macports ssh w/o luck. 
It appears the key is > offered up, so maybe the problem is how I have everything set up on > dev (though I followed everything on the wiki): A couple of things to check. - make sure that you put your public key in ~/.ssh/authorized_keys2 (not authorized_keys) - make sure that authorized_keys2 is chmod'ed 600 (644 might be enough...). - make sure that ~/.ssh is chmoded 700. - make sure that your home directory is 755. Then see if it works. You might be able to relax some of those protections a bit, but ssh's uptight about letting other people mess with that data. g. From dmessina at wustl.edu Fri Jun 29 22:47:14 2007 From: dmessina at wustl.edu (David Messina) Date: Fri, 29 Jun 2007 17:47:14 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <4684AF3D.5090907@sheffield.ac.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> <18051.61992.627473.323346@almost.alerce.com> <4684AF3D.5090907@sheffield.ac.uk> Message-ID: <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu> > [Nathan] > Don't .t files need adding to the auto-props? Yes -- thanks for reminding me. Please feel free to add it to the wiki page. I'll be tweaking it some more later on in any case. Dave From n.haigh at sheffield.ac.uk Sat Jun 30 09:55:56 2007 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Sat, 30 Jun 2007 10:55:56 +0100 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> <18051.61992.627473.323346@almost.alerce.com> <4684AF3D.5090907@sheffield.ac.uk> <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu> Message-ID: <468628AC.9060200@sheffield.ac.uk> David Messina wrote: >> [Nathan] >> Don't .t files need adding to the auto-props? > > Yes -- thanks for reminding me. Please feel free to add it to the wiki > page. I'll be tweaking it some more later on in any case. > > > Dave I noticed this has already been done. I have just been through the t/data dir and added a list of extensions I found (without props). There are some files without extensions, how should these be dealt with? There seems to be a plethora of file naming styles which means there's a pretty long list of non-standard extensions. So at some point someone will commit a new data file with a new extension (often describing what program created the output or the test for which it's intended) that won't be in the auto-props file - can you think of a way around this? 
Nath From cjfields at uiuc.edu Sat Jun 30 12:48:10 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 30 Jun 2007 07:48:10 -0500 Subject: [Bioperl-l] ssh keys [was Re: First cut svn repository] In-Reply-To: <18053.23363.102371.602742@almost.alerce.com> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com> <42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu> <18053.23363.102371.602742@almost.alerce.com> Message-ID: <3874B4EE-0119-40BC-8B92-11133A766417@uiuc.edu> On Jun 29, 2007, at 2:19 PM, George Hartzell wrote: > Chris Fields writes: >> >> On Jun 29, 2007, at 7:28 AM, David Messina wrote: >> >>>> >>>> BTW, I haven't been able to check out the new svn repository via >>>> svn+ssh:// because I can't get svn to authenticate with an >>>> alternative >>>> username. >>> >>> I have the same issue. I set up a stanza in my ~/.ssh/config: >>> >>> Host dev.open-bio.org >>> User dave_messina >>> >>> where dave_messina is my dev.open-bio.org username. >> >> I changed to the macports ssh w/o luck. It appears the key is >> offered up, so maybe the problem is how I have everything set up on >> dev (though I followed everything on the wiki): > > A couple of things to check. > > - make sure that you put your public key in ~/.ssh/authorized_keys2 > (not authorized_keys) > > - make sure that authorized_keys2 is chmod'ed 600 (644 might be > enough...). > > - make sure that ~/.ssh is chmoded 700. > > - make sure that your home directory is 755. > > Then see if it works. You might be able to relax some of those > protections a bit, but ssh's uptight about letting other people mess > with that data. > > g. 
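The permission checklist quoted above could be applied with something like the sketch below. The function name fix_ssh_perms is made up for this example, and the demo runs against a scratch directory rather than a real home directory. Which key file the server actually reads depends on its sshd configuration, so authorized_keys2 is taken from the checklist, not assumed to be universal.

```shell
# Apply the permissions from the checklist to a given home directory.
fix_ssh_perms() {
    home_dir=${1:-$HOME}
    chmod 755 "$home_dir"                         # home dir: writable by owner only
    chmod 700 "$home_dir/.ssh"                    # .ssh: owner only
    chmod 600 "$home_dir/.ssh/authorized_keys2"   # key file: owner read/write
}

# Demo on a throwaway directory so nothing real is modified.
demo=$(mktemp -d)
mkdir -p "$demo/.ssh"
touch "$demo/.ssh/authorized_keys2"
fix_ssh_perms "$demo"
ls -ld "$demo" "$demo/.ssh" "$demo/.ssh/authorized_keys2"
```

On the real server this would be run as fix_ssh_perms with no argument, after the public key has been appended to the key file.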
Got it working; it was the permissions on my home dir (the last one). Thanks George! chris From dmessina at wustl.edu Sat Jun 30 15:37:44 2007 From: dmessina at wustl.edu (David Messina) Date: Sat, 30 Jun 2007 10:37:44 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <468628AC.9060200@sheffield.ac.uk> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> <18051.61992.627473.323346@almost.alerce.com> <4684AF3D.5090907@sheffield.ac.uk> <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu> <468628AC.9060200@sheffield.ac.uk> Message-ID: <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu> > I have just been through the t/data dir and added a list of > extensions I found Thanks! That's a big help. I'll add prop definitions to those shortly. > There are some files without extensions, how should these be dealt > with? If you look in the text files section, there are some files there which don't have extensions, e.g. AUTHORS, BUGS. There's also Makefile.* so we have some flexibility in how svn knows to auto-prop a file. I haven't read up on the details yet to find out how it handles files that match multiple criteria -- it may be dependent simply on the order they're defined. > There seems to be a plethora of file naming styles which means > there's a pretty long list of non-standard extensions. So at some > point someone will commit a new data file with a new extension > (often describing what program created the output or the test for > which it's intended) that won't be in the auto-props file - can you > think of a way around this? I've been thinking about this a bit. How about this?
- We have just "standard" files and extensions (like *.blast, *.fasta) in the auto-props list. - We manually add props for the files that have nonstandard, arbitrary extensions so all the files have now are prop'd. - At some point we rename those nonstandard files to have standard extensions. Especially for the t/data/ files, we'll have to make sure to update the tests that rely on them. - We can have the suggested list of extensions for new files that get added. I don't think we need to strictly enforce this just for the sake of svn (after all, its primary function of version control will work just fine without any properties set), but it would be nice if we could try to keep to it mostly. Many distros come with an /etc/mime.types file which has the list of officially registered MIME types. I found a script that will take this list and convert it into auto-props format. I don't think we need to support *all* of the gazillion filetypes since most of them our repository will never see, but we certainly could.
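For reference, a short auto-props list along these lines might look like the sketch below. The entries are illustrative only and are not the list on the wiki: the extensions come from examples mentioned in this thread (*.blast, *.fasta, .t files, and the binary .icm test file), and the chosen property values are assumptions. The sample is written to a scratch file, not to ~/.subversion/config.

```shell
# Write a sample auto-props fragment to a scratch file for inspection.
sample=$(mktemp)
cat > "$sample" <<'EOF'
[miscellany]
enable-auto-props = yes

[auto-props]
# text formats: let svn normalise line endings
*.pm    = svn:eol-style=native;svn:keywords=Id
*.t     = svn:eol-style=native
*.fasta = svn:eol-style=native
*.blast = svn:eol-style=native
# binary data: mark as binary, no eol-style or keywords
*.icm   = svn:mime-type=application/octet-stream
EOF
cat "$sample"
```

Auto-props are a client-side setting, so each committer would need the same entries in their own Subversion config for new files to pick them up.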
Dave From dmessina at wustl.edu Sat Jun 30 16:26:27 2007 From: dmessina at wustl.edu (David Messina) Date: Sat, 30 Jun 2007 11:26:27 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> <18051.61992.627473.323346@almost.alerce.com> <4684AF3D.5090907@sheffield.ac.uk> <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu> <468628AC.9060200@sheffield.ac.uk> <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu> Message-ID: On Jun 30, 2007, at 10:37 AM, David Messina wrote: > - We manually add props for the files that have nonstandard, > arbitrary extensions so all the files have now are prop'd. Er, that should be - We manually add props for the files that have nonstandard, arbitrary extensions so that all the files now in the repository are prop'd. From n.haigh at sheffield.ac.uk Sat Jun 30 17:25:58 2007 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Sat, 30 Jun 2007 18:25:58 +0100 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu> References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> <18051.61992.627473.323346@almost.alerce.com> <4684AF3D.5090907@sheffield.ac.uk> <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu> <468628AC.9060200@sheffield.ac.uk> <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu> Message-ID: <46869226.70203@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 - -- snip -- > > >> There seems to be a plethora of file naming styles which means there's >> a pretty long list of non-standard extensions. So at some point >> someone will commit a new data file with a new extension (often >> describing what program created the output or the test for which it's >> intended) that won't be in the auto-props file - can you think of a >> way around this? > > Ive been thinking about this a bit. How about this? > > - We have just "standard" files and extensions (like *.blast, *.fasta) > in the auto-props list. I think the list of seq formats recognised by Bioperl in Bio::SeqIO and Bio::AlignIO would be a good start. As these are likely to be the ones that are sensitive to file format recognition and thus could break tests if renamed. I think a lot of people have used "." in file names as an alternative to a space. I think it would be beneficial to use an underscore "_" in these cases and leave the "." to represent the beginning of the file extension. 
> > - We manually add props for the files that have nonstandard, arbitrary > extensions so all the files that we currently have now are prop'd. > > - At some point we rename those nonstandard files to have standard > extensions. Especially for the t/data/ files, we'll have to make sure to > update the tests that rely on them. Nice and easy with svn :) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGhpHiczuW2jkwy2gRAuZ5AKCnd2MvCsvSn1NemDVMmabnieR2vACg1Qk0 pYVvXwxq0lpiGfM09RQ6A1I= =3Lhw -----END PGP SIGNATURE----- From cjfields at uiuc.edu Sat Jun 30 19:11:52 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 30 Jun 2007 14:11:52 -0500 Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re: Perltidy] In-Reply-To: References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu> <18051.44281.831316.749586@almost.alerce.com> <18051.61992.627473.323346@almost.alerce.com> <4684AF3D.5090907@sheffield.ac.uk> <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu> <468628AC.9060200@sheffield.ac.uk> <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu> Message-ID: On Jun 30, 2007, at 11:26 AM, David Messina wrote: > > On Jun 30, 2007, at 10:37 AM, David Messina wrote: > >> - We manually add props for the files that have nonstandard, >> arbitrary extensions so all the files have now are prop'd. > > Er, that should be > > - We manually add props for the files that have nonstandard, > arbitrary extensions so that all the files now in the repository are > prop'd. Do we need to define every filetype extension, or can there be a fallback (eg if it isn't on the list or has no extension it's plain text)? 
chris

From hlapp at gmx.net Sat Jun 30 21:26:22 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 30 Jun 2007 17:26:22 -0400
Subject: [Bioperl-l] Splits again
In-Reply-To: <468409C7.7020102@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> <4683DBEA.90005@sendu.me.uk> <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu> <468409C7.7020102@sendu.me.uk>
Message-ID:

On Jun 28, 2007, at 3:19 PM, Sendu Bala wrote:

> [...]
> Very definitely the latter. The key benefit of my approach is that
> the organisation stays as is and that a snapshot of the repository
> remains a single directory of modules in Bio so that people don't
> have to 'install' Bioperl, they can still just uncompress the
> archive (or check out the package from svn) and point their
> PERL5LIB to the root dir of the package.

I think this is absolutely key to keep in mind. Anything without this feature will likely be a non-starter.

I don't really have time to follow the discussion let alone participate, so really all I can contribute is to offer some sanity/reality checks (such as the above).

In this sense, I understand a release pumpkin will generate ~900 packages to upload to CPAN? How much hassle is that compared to what uploading a bioperl release means right now?

How brittle is all the Build.PL code that will be needed to automate all of this, and how difficult will it be to maintain? For example, if someone adds in 10 new modules, what Build.PL-related work will need to be done?
-hilmar
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From bix at sendu.me.uk Sat Jun 30 21:32:52 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Sat, 30 Jun 2007 22:32:52 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To:
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk> <46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk> <4682B798.1010409@sheffield.ac.uk> <4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk> <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu> <4682E824.1050507@sendu.me.uk> <4683624F.6020402@sendu.me.uk> <4683DBEA.90005@sendu.me.uk> <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu> <468409C7.7020102@sendu.me.uk>
Message-ID: <4686CC04.6000403@sendu.me.uk>

Hilmar Lapp wrote:
> On Jun 28, 2007, at 3:19 PM, Sendu Bala wrote:
>
>> [...]
>> Very definitely the latter. The key benefit of my approach is that
>> the organisation stays as is and that a snapshot of the repository
>> remains a single directory of modules in Bio so that people don't
>> have to 'install' Bioperl, they can still just uncompress the
>> archive (or check out the package from svn) and point their
>> PERL5LIB to the root dir of the package.
[snip]
> In this sense, I understand a release pumpkin will generate ~900
> packages to upload to CPAN? How much hassle is that compared to what
> uploading a bioperl release means right now?

I'd have to investigate. I did my uploads using the PAUSE website, which for 900 packages would be unfeasible. Will have to see if the process can be automated.

> How brittle is all the Build.PL code that will be needed to automate
> all of this, and how difficult will it be to maintain? For example,
> if someone adds in 10 new modules, what Build.PL-related work will
> need to be done?

Well, my plan will be that once the work is done, you won't need to touch the Build.PL code again.
My intent is that the pumpkin can just type one command and not think about anything. As for the reality, I won't know until I think about it properly and experiment.

From hlapp at gmx.net Sat Jun 30 23:36:45 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 30 Jun 2007 19:36:45 -0400
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <18052.3946.224905.415905@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca> <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net> <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu> <4673C7CB.1030709@mail.nih.gov> <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu> <18049.30026.61328.134490@almost.alerce.com> <4683A7D1.8070403@sendu.me.uk> <18051.48684.996884.134046@almost.alerce.com> <4683C385.3050904@sendu.me.uk> <18051.63674.685297.426813@almost.alerce.com> <18052.3946.224905.415905@almost.alerce.com>
Message-ID: <2159ED58-E6F4-4ED8-AC23-E8BAF69FE240@gmx.net>

On Jun 28, 2007, at 3:43 PM, George Hartzell wrote:

> I just did the experiment, and filename-insensitivity seems to be
> breaking something.
>
> I'm using an svn I picked up from http://www.codingmonkeys.de/mbo/.
>
> I reformatted a memory stick to be case sensitive and co of
>
> bioperl/bioperl-live/tags/release-0-9-2/t
>
> worked, then I made a directory in my home dir (normal mac thing) and
> got the same error as above.

You picked up a rename of a file from lower case extension to upper case extension. Unfortunately, there are several months between adding the upper-case and removing the lower-case version.
We can reconstruct what happened with this using svn log on the directory (this does not require a checkout):

$ svn log --verbose svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk/t/data

Searching for HUMBETGLOA yields the following two commits that added one and removed the other:

------------------------------------------------------------------------
r2245 | jason | 2001-12-08 11:59:05 -0500 (Sat, 08 Dec 2001) | 2 lines
Changed paths:
   M /bioperl-live/trunk/t/SearchIO.t
   A /bioperl-live/trunk/t/data/HUMBETGLOA.FASTA
   A /bioperl-live/trunk/t/data/cysprot1.FASTA

added tests for FASTA
------------------------------------------------------------------------
r2877 | jason | 2002-03-11 22:39:40 -0500 (Mon, 11 Mar 2002) | 2 lines
Changed paths:
   A /bioperl-live/trunk/t/data/HUMBETGLOA.fa
   D /bioperl-live/trunk/t/data/HUMBETGLOA.fasta

renaming file to avoid clobbering on windows

Unfortunately, both files are in the tag (again, no checkout required):

$ svn list svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data | grep HUMBETGLOA | grep -i fasta
HUMBETGLOA.FASTA
HUMBETGLOA.fasta

We can remove the offending version from the repository (again, without needing a checkout):

$ svn rm svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data/HUMBETGLOA.fasta

I did this, and now the tag checks out fine on OSX. Can anyone confirm?

(BTW the ability to operate on the repository w/o needing a checkout is another advantage of svn)

-hilmar
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================
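The check in the message above greps for one known file name. A more general version might look for every pair of names in a listing that differ only by letter case, since those are the ones that clobber each other on case-insensitive filesystems. The following is only a rough sketch: find_case_collisions is a hypothetical helper name, not part of svn or Bioperl, and it assumes one file name per line on stdin (for example, the output of svn list on a tag).

```shell
#!/bin/sh
# Hypothetical helper (not part of svn or Bioperl): reads one file name per
# line on stdin and prints each group of names that differ only by case.
find_case_collisions() {
    awk '
        {
            key = tolower($0)        # case-folded name identifies the group
            count[key]++
            group[key] = (count[key] == 1) ? $0 : group[key] " " $0
        }
        END {
            for (key in count)
                if (count[key] > 1)
                    print group[key] # only groups with a real collision
        }
    ' | sort
}

# Example using the file names from the log above; in practice the input
# could be piped in from an svn list of the tag directory.
printf '%s\n' HUMBETGLOA.FASTA HUMBETGLOA.fasta cysprot1.FASTA | find_case_collisions
# prints: HUMBETGLOA.FASTA HUMBETGLOA.fasta
```

Names within a group are printed in input order and separated by a space, so file names that themselves contain spaces would need a different separator.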